Image review
Overview
The image review service is an intelligent review solution from Baidu AI Cloud, offering multidimensional review capabilities, including pornography detection, violent terrorism recognition, political sensitivity analysis, disgusting content identification, and advertisement recognition. It is applicable in various scenarios, such as gaming, social platforms, forums, lifestyle services, and UGC (user-generated content). For example, game developers can use this service to review custom avatars uploaded by players.
The image review service is seamlessly integrated with BOS, allowing you to access its capabilities through BOS API, SDK, CLI, and other methods.
Charge type
As the call entry of the image review service, BOS only charges for BOS API requests. For details, please refer to [BOS Product Pricing](BOS/Product pricing/Product price/Pay-As-You-Go Charge Type.md).
The fee for the image review service is charged by the image review product. For details, please refer to Image Review Product Documentation.
Enable the image review service
Currently, Baidu AI Cloud supports reviewing images stored on BOS in two ways: calling the image review API directly, or automatic image review. To use automatic image review, first enable the image review service through the console. Once the image review rules take effect, newly uploaded files are reviewed automatically. You can upload objects to BOS through the [console](BOS/Console Operation Guide/Manage object/Upload files.md), [SDK](BOS/Developer Guide/Object Basic Operations/Uploading Data/Simple upload.md), CLI, and other methods.
- To enable the image review service on the console, please refer to Setting Up Image Review Service on Console.
- After enabling the image review service, users can call the image review function through the BOS CLI Tool or BOS API. The following describes the API in detail.
API
This API enables access to the image review service.
Note: To use this API, you need to enable the image review service on the console first.
Request
- Request syntax

    POST <ObjectName>?process HTTP/1.1
    Host: <BucketName>.bj.bcebos.com
    Date: <Date>
    Authorization: <AuthorizationString>
    Content-Type: application/json; charset=utf-8
    Content-Length: <ContentLength>

    {
      "action" : {
        "sync" : [{
            "url" : "$(img-censor)",
            "parameters" : "<base64_encode(param)>"
          }
        ]
      }
    }

- Request parameters
  - url: The fixed value is $(img-censor), which should not be modified.
  - parameters: The image review function parameters, written as a JSON string and then base64-encoded.

Basic structure of image review function parameters:

    {
      "antiporn" : {},
      "terror" : {},
      "ocr" : {
        "detect_direction" : "false",
        "language_type" : "CHN_ENG",
        "recognize_granularity" : "big"
      }
    }

The parameters may combine any number of sub-services as needed, or name a single sub-service. For pornography detection alone, the parameter is written as:

    {
      "antiporn" : {}
    }

The list of sub-services supported by the image review service includes:
- ocr: General optical character recognition
- face: Face detection
- antiporn: Pornography recognition
- politician: Politician recognition
- terror: Violent terrorism recognition
- public: Public figure recognition
- disgust: Disgusting image recognition
- watermark: Watermark QR code recognition
- quality: Image quality recognition

The value given for each sub-service corresponds to the parameters of the underlying service. For ocr, for example, refer to the input parameters of the Hetu general optical character recognition (OCR) service.
Detailed explanations of request and response parameters for sub-services are provided below.
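The encoding step described above can be sketched with Python's standard library. This is a minimal illustration, not an official client: the helper name `encode_censor_params` is hypothetical, and the compact JSON separators are an assumption made here for a short, deterministic string (the documentation does not say whether whitespace in the JSON matters).

```python
import base64
import json


def encode_censor_params(params: dict) -> str:
    """Serialize sub-service parameters to JSON, then base64-encode the result."""
    # Assumption: compact separators; the service may also accept spaced JSON.
    raw = json.dumps(params, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


# A single sub-service: pornography recognition only.
encoded = encode_censor_params({"antiporn": {}})
print(encoded)
```

The returned string is what goes into the `parameters` field of the request body.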
- Request headers: None
Response
- Response headers: None
- Response elements: Refer to the explanations for response parameters of each sub-service.
Example
- Request example

The image review function parameters before encoding:

    {
      "antiporn": {},
      "ocr": {
        "detect_direction": "false",
        "language_type": "CHN_ENG",
        "recognize_granularity": "big"
      }
    }

After base64 encoding:

    eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ==

Place the encoded string in the parameters field. The request sent is as follows:
    POST <ObjectName>?process HTTP/1.1
    Host: <BucketName>.bj.bcebos.com
    Date: <Date>
    Authorization: <AuthorizationString>
    Content-Type: application/json; charset=utf-8
    Content-Length: <ContentLength>

    {
      "action" : {
        "sync" : [{
            "url" : "$(img-censor)",
            "parameters" : "eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ=="
          }
        ]
      }
    }
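As a sketch of how a client might assemble this request body, the snippet below builds the JSON payload and the two content headers in Python. The function name `build_process_body` is hypothetical. Sending the request is left out on purpose, because the `Date` and `Authorization` headers require BOS request signing, which this page does not cover.

```python
import base64
import json


def build_process_body(params: dict) -> bytes:
    """Build the JSON body for a `POST <ObjectName>?process` image review request."""
    # Assumption: compact JSON before encoding, matching the encoded example above.
    raw = json.dumps(params, separators=(",", ":"))
    encoded = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    body = {"action": {"sync": [{"url": "$(img-censor)", "parameters": encoded}]}}
    return json.dumps(body).encode("utf-8")


params = {
    "antiporn": {},
    "ocr": {
        "detect_direction": "false",
        "language_type": "CHN_ENG",
        "recognize_granularity": "big",
    },
}
body = build_process_body(params)
headers = {
    "Content-Type": "application/json; charset=utf-8",
    "Content-Length": str(len(body)),
}
print(body.decode("utf-8"))
```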
- Successful response example

    HTTP/1.1 200 OK
    Date: Thu, 22 Jun 2017 07:30:56 GMT
    Content-Type: application/json; charset=utf-8
    Content-Length: 237
    Connection: keep-alive
    Server: BceBos
    x-bce-debug-id: MTAuNzUuNzguNDA6VGh1LCAyMiBKdW4gMjAxNyAxNTozMDo1NiBDU1Q6MTg1MDY5NDg3OQ==
    x-bce-request-id: 598f7e18-77fb-424a-bc68-95acb0644076

    {
      "result" : {
        "antiporn" : {
          "result" : [{
              "probability" : 0.000071,
              "class_name": "Pornography"
            }, {
              "probability" : 0.000291,
              "class_name": "Sexy"
            }, {
              "probability" : 0.999638,
              "class_name": "Normal"
            }
          ],
          "log_id" : 1853911322,
          "result_num" : 3
        },
        "ocr" : {
          "log_id" : 2471272194,
          "words_result_num" : 2,
          "words_result" : [{
              "words" : " TSINGTAO"
            }, {
              "words": "Qingdao"
            }
          ]
        }
      },
      "log_id" : 149811665151162
    }

- Failed response example

    {
      "log_id": 149319909347709,
      "error_code": 216500,
      "error_msg": "unknown error"
    }

Error codes returned on failure:

| Error code | Error message | Description |
|---|---|---|
| 216101 | not enough param | Insufficient parameters |
| 216102 | service not support | An unsupported underlying service type is entered |
| 216200 | empty imge | No image URL |
| 216500 | unknown error | Unknown error |
| 282804 | download image error | Image download failed |
| 282000 | logic internal error | Service logic layer error |
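One way a caller might tell the two response shapes apart is to check for `error_code` before reading `result`. The sketch below is illustrative only: `ImageReviewError` and `parse_review_response` are hypothetical names, and the descriptions are copied from the error code list above.

```python
import json

# Error codes and descriptions taken from the list above.
ERROR_DESCRIPTIONS = {
    216101: "Insufficient parameters",
    216102: "An unsupported underlying service type is entered",
    216200: "No image URL",
    216500: "Unknown error",
    282804: "Image download failed",
    282000: "Service logic layer error",
}


class ImageReviewError(Exception):
    """Raised when a response body carries an error_code (hypothetical helper)."""


def parse_review_response(body: str) -> dict:
    """Return the per-sub-service results, or raise if the body reports an error."""
    data = json.loads(body)
    if "error_code" in data:
        code = data["error_code"]
        # Fall back to the server's own message for codes not listed here.
        description = ERROR_DESCRIPTIONS.get(code, data.get("error_msg", "unrecognized error"))
        raise ImageReviewError(f"{code}: {description}")
    return data["result"]
```

Called with the failed response example, this raises `ImageReviewError`; called with a successful body, it returns the object keyed by sub-service name.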
ocr
Function
The user sends a request for the service to recognize all characters within an image.
Request parameters
| Parameters | Required or not | Types | Option range | Description |
|---|---|---|---|---|
| language_type | No | string | CHN_ENG, ENG, POR, FRE, GER, ITA, SPA, RUS, JAP, KOR | Recognition language type; defaults to CHN_ENG. Available options: - CHN_ENG: Chinese-English mixed; - ENG: English; - POR: Portuguese; - FRE: French; - GER: German; - ITA: Italian; - SPA: Spanish; - RUS: Russian; - JAP: Japanese; - KOR: Korean. |
| detect_direction | No | boolean | true, false | Specifies whether to detect image orientation; defaults to false (no detection). Orientation refers to whether the input image is in its normal position or rotated 90/180/270 degrees counterclockwise. Available options: - true: detect orientation; - false: do not detect orientation. |
| detect_language | No | string | true, false | Specifies whether to detect the language; defaults to no detection. Currently supports Chinese, English, Japanese, and Korean. |
| probability | No | string | true, false | Specifies whether to return the confidence level for each row in the recognition results. |
Response parameters
| Field | Required or not | Types | Description |
|---|---|---|---|
| direction | No | int32 | Image orientation; returned when detect_direction=true. Possible values: -1: undefined; 0: upright; 1: rotated 90 degrees counterclockwise; 2: rotated 180 degrees counterclockwise; 3: rotated 270 degrees counterclockwise. |
| log_id | Yes | uint64 | Unique log ID, used for troubleshooting |
| words_result | Yes | array() | Recognition result array |
| words_result_num | Yes | uint32 | Number of recognition results, indicating the number of elements in words_result |
| +words | No | string | Recognition result string |
| probability | No | object | Confidence value for each row in the recognition result, including average: average row confidence, variance: row confidence variance, min: minimum row confidence |
Response example
    {
      "log_id": 2471272194,
      "words_result_num": 2,
      "words_result": [
        {"words": " TSINGTAO"},
        {"words": "Tsingtao Beer"}
      ]
    }
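A short sketch of reading this response in Python follows. The helper name `extract_lines` is hypothetical, and stripping surrounding whitespace is a choice made here (the example contains a leading space in " TSINGTAO"), not documented service behavior.

```python
import json

# The OCR response example above, parsed from JSON.
ocr_response = json.loads("""
{
  "log_id": 2471272194,
  "words_result_num": 2,
  "words_result": [
    {"words": " TSINGTAO"},
    {"words": "Tsingtao Beer"}
  ]
}
""")


def extract_lines(response: dict) -> list:
    """Return the recognized text lines with surrounding whitespace removed."""
    return [item["words"].strip() for item in response.get("words_result", [])]


lines = extract_lines(ocr_response)
print(lines)
```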
face
Function
- Detects faces in the input image and provides face locations, 72 key point coordinates, and face-related attribute information.
- The detection response time depends on the number of faces in the image. A larger number of faces may slightly slow down the response.
- Typical application scenarios include facial attribute analysis, processing and analysis based on facial key points, and facial marketing campaigns.
- Coordinates are returned for the main facial features (eyes, nose, and mouth). The 72 key points are returned with exact coordinates, but without a description of which facial location each point marks.
Request parameters
None
Response parameters
| Parameters | Types | Required or not | Description |
|---|---|---|---|
| log_id | uint64 | Yes | Log ID |
| result_num | uint32 | Yes | Face count |
| result | object[] | Yes | Collection of facial attribute objects |
| +age | double | No | Age information is returned if face_fields includes age. |
| +beauty | double | No | Beauty score ranges from 0 to 100, with higher scores indicating greater beauty. This value is returned when face_fields includes beauty. |
| +location | object | Yes | Position of face in the image |
| ++left | uint32 | Yes | Distance from face region to the left boundary |
| ++top | uint32 | Yes | Distance from face region to the upper boundary |
| ++width | uint32 | Yes | Width of face region |
| ++height | uint32 | Yes | Height of face region |
| +face_probability | double | Yes | Face confidence, ranging 0-1 |
| +rotation_angle | int32 | Yes | Clockwise rotation angle of face frame relative to vertical direction, [-180,180] |
| +yaw | double | Yes | 3D rotation - left/right rotation angle: [-90 (left), 90(right)] |
| +pitch | double | Yes | 3D rotation - pitch angle [-90 (Upper), 90 (Lower)] |
| +roll | double | Yes | In-plane rotation angle [-180 (counterclockwise), 180(clockwise)] |
| +expression | uint32 | No | Expression: 0 for no smile or laugh, 1 for smile, and 2 for laugh. This is returned when face_fields includes expression. |
| +expression_probability | double | No | Expression confidence ranges from 0 to 1 and is provided when face_fields includes expression. |
| +faceshape | object[] | No | Face shape confidence is returned if face_fields includes faceshape. |
| ++type | string | Yes | Face shape: square/triangle/oval/heart/round |
| ++probability | double | Yes | Confidence: 0~1 |
| +gender | string | No | Gender (male or female) is returned when face_fields includes gender. |
| +gender_probability | double | No | Gender confidence ranges from 0 to 1 and is returned if face_fields includes gender. |
| +glasses | uint32 | No | Glasses: 0 for no glasses, 1 for ordinary glasses, and 2 for sunglasses. This is returned when face_fields includes glasses. |
| +glasses_probability | double | No | Glasses confidence ranges from 0 to 1 and is included if face_fields includes glasses. |
| +landmark | object[] | No | Key point positions for the left eye center, right eye center, nose tip, and mouth center are provided when face_fields includes landmark. |
| ++x | uint32 | No | X-coordinate |
| ++y | uint32 | No | Y-coordinate |
| +landmark72 | object[] | No | Positions of the 72 feature points; returned when face_fields includes landmark. |
| ++x | uint32 | No | X-coordinate |
| ++y | uint32 | No | Y-coordinate |
| +race | string | No | Race options include yellow, white, black, and Arab, and are returned when face_fields includes race. |
| +race_probability | double | No | Race confidence ranges from 0 to 1 and is returned if face_fields includes race. |
| +qualities | object | No | Face quality information is included if face_fields includes qualities. |
| ++occlusion | object | Yes | Probability of occlusion for facial parts, [0, 1] (pending launch) |
| +++left_eye | double | Yes | Left eye |
| +++right_eye | double | Yes | Right eye |
| +++nose | double | Yes | Nose |
| +++mouth | double | Yes | Mouth |
| +++left_cheek | double | Yes | Left cheek |
| +++right_cheek | double | Yes | Right cheek |
| +++chin | double | Yes | Chin |
| ++blur | double | Yes | Face blurriness, [0, 1]. 0 indicates clear; 1 indicates blurry (pending launch) |
| ++illumination | - | Yes | The value range is [0,255], indicating the lighting level of the facial area (pending launch) |
| ++completeness | - | Yes | Face completeness, [0, 1]. 0 indicates complete; 1 indicates incomplete (pending launch) |
| ++type | object | Yes | Real face/cartoon face confidence |
| +++human | - | Yes | Real face confidence, [0, 1] |
| +++cartoon | - | Yes | Cartoon face confidence, [0, 1] |
Response example
    {
      "result_num": 1,
      "result": [
        {
          "location": {
            "left": 90,
            "top": 92,
            "width": 111,
            "height": 99
          },
          "face_probability": 1,
          "rotation_angle": 6,
          "yaw": 11.61234664917,
          "pitch": -0.30852827429771,
          "roll": 8.8044967651367,
          "landmark": [
            {
              "x": 105,
              "y": 110
            },
            ...
          ],
          "landmark72": [
            {
              "x": 88,
              "y": 109
            },
            ...
          ],
          "gender": "male",
          "gender_probability": 0.99358034133911,
          "glasses": 0,
          "glasses_probability": 0.99991309642792,
          "race": "yellow",
          "race_probability": 0.99960690736771,
          "qualities": {
            "occlusion": {
              "left_eye": 0.000085282314103097,
              "right_eye": 0.00001094374601962,
              "nose": 3.2677664307812e-7,
              "mouth": 2.6582130940866e-10,
              "left_cheek": 8.752236624332e-8,
              "right_cheek": 1.0212766454742e-7,
              "chin": 4.2632994357028e-10
            },
            "blur": 4.5613666312237e-41,
            "illumination": 0,
            "completeness": 0,
            "type": {
              "human": 0.98398965597153,
              "cartoon": 0.016010366380215
            }
          }
        }
      ],
      "log_id": 2418894422
    }
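To show how the `location` fields relate to each other, the sketch below converts a face region into corner coordinates. It assumes the values are pixel offsets measured from the top-left corner of the image, which the tables imply but do not state outright. The helper name `face_box` is hypothetical.

```python
def face_box(face: dict) -> tuple:
    """Convert a face `location` object to (left, top, right, bottom) bounds."""
    loc = face["location"]
    # Assumption: left/top are offsets from the image's top-left corner, in pixels.
    return (
        loc["left"],
        loc["top"],
        loc["left"] + loc["width"],
        loc["top"] + loc["height"],
    )


# Abbreviated from the response example above.
face = {
    "location": {"left": 90, "top": 92, "width": 111, "height": 99},
    "face_probability": 1,
}
print(face_box(face))
```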
antiporn
Function
Pornography detection.
Request parameters
None
Response parameters
| Field | Types | Required or not | Description |
|---|---|---|---|
| confidence_coefficient | string | Yes | Confidence of the result, classified into two categories: "Definite" and "Indefinite." |
| result_num | uint32 | Yes | The number of response results corresponds to the number of elements in the result array. |
| result | array(array(double)) | Yes | The results array contains items, each representing an outcome from a classification dimension. |
| conclusion | string | Yes | The final evaluation result for this image is categorized into three types: "Pornography," "Sexy," and "Normal." |
| log_id | uint64 | Yes | Request identifier: a unique, randomly generated number. |
Each element of the result array contains the following fields:
| Field | Types | Required or not | Description | Example |
|---|---|---|---|---|
| class_name | string | Yes | Classification result name | Pornography |
| probability | double | Yes | Classification result confidence | 0.89471650123596 |
Response example
    {
      "result": [{
          "probability": 0.000301,
          "class_name": "Pornography"
        },
        {
          "probability": 0.000054,
          "class_name": "Sexy"
        },
        {
          "probability": 0.999645,
          "class_name": "Normal"
        }],
      "conclusion": "Normal",
      "log_id": 848999404,
      "confidence_coefficient": "Definite",
      "result_num": 3
    }
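The sketch below shows one way to read this response: pick the class with the highest probability and decide whether to escalate. Both helper names are hypothetical, and treating any non-"Definite" result as needing manual review is an illustrative policy, not something the service prescribes.

```python
# The antiporn response example above, as a Python dict.
antiporn_response = {
    "result": [
        {"probability": 0.000301, "class_name": "Pornography"},
        {"probability": 0.000054, "class_name": "Sexy"},
        {"probability": 0.999645, "class_name": "Normal"},
    ],
    "conclusion": "Normal",
    "log_id": 848999404,
    "confidence_coefficient": "Definite",
    "result_num": 3,
}


def top_class(response: dict) -> str:
    """Return the class_name with the highest probability."""
    best = max(response["result"], key=lambda item: item["probability"])
    return best["class_name"]


def needs_manual_review(response: dict) -> bool:
    """Hypothetical policy: escalate anything the service does not mark "Definite"."""
    return response["confidence_coefficient"] != "Definite"


print(top_class(antiporn_response), needs_manual_review(antiporn_response))
```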
politician
Function
Politician identification.
Request parameters
None
Response parameters
| Parameters | Subparameter | Subparameter | Types | Required or not | Description |
|---|---|---|---|---|---|
| log_id | - | - | uint64 | Yes | Log ID |
| result_num | - | - | uint32 | Yes | Actual number of faces detected (not exceeding max_face_num) |
| result | - | - | object[] | Yes | |
| - | location | - | object | Yes | Position of face in the input image |
| - | - | left | uint32 | Yes | Distance from face region to the left boundary |
| - | - | top | uint32 | Yes | Distance from face region to the upper boundary |
| - | - | width | uint32 | Yes | Width of face region |
| - | - | height | uint32 | Yes | Height of face region |
| - | stars | - | object[] | Yes | Public figure array |
| - | - | name | string | Yes | Name |
| - | - | star_id | string | Yes | Figure ID, globally unique |
| - | - | probability | float | Yes | Similarity, [0, 1] |
Response example
    {
      "log_id": 3268660173,
      "result_num": 1,
      "result": [
        {
          "location": {
            "left": 132,
            "top": 168,
            "width": 238,
            "height": 223
          },
          "stars": [
            {
              "name": "Zhang San",
              "star_id": "515617",
              "probability": 0.9750030040741
            }
          ]
        }
      ]
    }
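A response of this shape could be reduced to a list of matched names as in the sketch below. The helper name `matched_names` and the 0.9 similarity threshold are assumptions chosen for illustration; the documentation does not recommend a cutoff.

```python
# The politician response example above, as a Python dict.
politician_response = {
    "log_id": 3268660173,
    "result_num": 1,
    "result": [
        {
            "location": {"left": 132, "top": 168, "width": 238, "height": 223},
            "stars": [
                {"name": "Zhang San", "star_id": "515617", "probability": 0.9750030040741}
            ],
        }
    ],
}


def matched_names(response: dict, threshold: float = 0.9) -> list:
    """Collect names whose similarity meets the (assumed) threshold."""
    names = []
    for face in response.get("result", []):
        for star in face.get("stars", []):
            if star["probability"] >= threshold:
                names.append(star["name"])
    return names


print(matched_names(politician_response))
```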
terror
Function
Violent terrorism recognition
Request parameters
None
Response parameters
| Field | Types | Required or not | Description |
|---|---|---|---|
| result | array(array(double)) | Yes | Violent terrorism confidence score. |
| log_id | uint64 | Yes | Request identifier, a random and unique number |
| result_coarse | object[] | Yes | Coarse-grained score result |
| +name | string | Yes | Coarse-grained tags, including two tags: normal and violent terrorism |
| +score | float | Yes | Confidence score of the corresponding tag. The higher the score, the higher the confidence. |
| result_fine | object[] | Yes | Fine-grained score result |
| +name | string | Yes | Fine-grained tags, including 9 tags: normal, police force, bloody, corpse, explosion and fire, murder, riot, terrorists, and military weapons |
| +score | float | Yes | Confidence score of the corresponding tag. The higher the score, the higher the confidence. |
Response example
    {
      "errno": 0,
      "msg": "success",
      "data": {
        "result": 0.0082325544208288,
        "result_coarse": [
          {
            "name": "Normal",
            "score": 0.99176746606827
          },
          {
            "name": "Violent Terrorism",
            "score": 0.0082325544208288
          }
        ],
        "result_fine": [
          {
            "name": "Normal",
            "score": 0.98908758163452
          },
          {
            "name": "Police Force",
            "score": 0.0062405453063548
          },
          {
            "name": "Bloody",
            "score": 0.0009653537417762
          },
          {
            "name": "Corpse",
            "score": 0.001054480439052
          },
          {
            "name": "Explosion and Fire",
            "score": 0.00011743687355192
          },
          {
            "name": "Murder",
            "score": 0.0011699661845341
          },
          {
            "name": "Riot",
            "score": 0.000021190358893364
          },
          {
            "name": "Terrorist",
            "score": 0.0010401027975604
          },
          {
            "name": "Military Weapons",
            "score": 0.00030337597127073
          }
        ]
      }
    }
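The fine-grained scores can be summarized by taking the highest-scoring tag, as sketched below. Note that the example nests its fields under `data` while the parameter table lists them at the top level; the hypothetical helper `top_fine_tag` accepts either shape as a precaution, since this page does not say which one the service returns in practice.

```python
# Abbreviated from the terror response example above (three of the nine tags).
terror_response = {
    "errno": 0,
    "msg": "success",
    "data": {
        "result": 0.0082325544208288,
        "result_fine": [
            {"name": "Normal", "score": 0.98908758163452},
            {"name": "Police Force", "score": 0.0062405453063548},
            {"name": "Bloody", "score": 0.0009653537417762},
        ],
    },
}


def top_fine_tag(response: dict) -> tuple:
    """Return (name, score) of the highest-scoring fine-grained tag."""
    # Unwrap "data" when present; otherwise read the fields from the top level.
    payload = response.get("data", response)
    best = max(payload["result_fine"], key=lambda tag: tag["score"])
    return best["name"], best["score"]


print(top_fine_tag(terror_response))
```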
public
Function
Public figure recognition
Request parameters
| Parameters | Required or not | Types | Description |
|---|---|---|---|
| max_face_num | No | uint32 | Maximum number of faces processed, defaulting to 1, maximum 5 |
| max_star_num | No | uint32 | Maximum number of similar celebrities per face, defaulting to 4 |
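As a sketch of how these two options might be placed into the image review function parameters, the snippet below builds the `public` entry and checks the documented maximum. The helper name `public_params` is hypothetical. The lower bound of 1 and the choice to pass the values as JSON numbers rather than strings are assumptions; this page does not specify either.

```python
def public_params(max_face_num: int = 1, max_star_num: int = 4) -> dict:
    """Build the `public` sub-service entry for the image review parameters."""
    # Documented: max_face_num defaults to 1 and may not exceed 5.
    # Assumption: at least one face and one candidate must be requested.
    if not 1 <= max_face_num <= 5:
        raise ValueError("max_face_num must be between 1 and 5")
    if max_star_num < 1:
        raise ValueError("max_star_num must be at least 1")
    return {"public": {"max_face_num": max_face_num, "max_star_num": max_star_num}}


print(public_params(max_face_num=3))
```

The resulting dict would then be JSON-serialized and base64-encoded like any other parameter object.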
Response parameters
| Parameters | Subparameter | Subparameter | Types | Required | Description |
|---|---|---|---|---|---|
| log_id | - | - | uint64 | Yes | Log ID |
| result_num | - | - | uint32 | Yes | Actual number of faces detected (not exceeding max_face_num) |
| result | - | - | object[] | Yes | |
| - | location | - | object | Yes | Position of face in the input image |
| - | - | left | uint32 | Yes | Distance from face region to the left boundary |
| - | - | top | uint32 | Yes | Distance from face region to the upper boundary |
| - | - | width | uint32 | Yes | Width of face region |
| - | - | height | uint32 | Yes | Height of face region |
| - | stars | - | object[] | Yes | Public figure array |
| - | - | name | string | Yes | Name |
| - | - | star_id | string | Yes | Figure ID, globally unique |
| - | - | probability | float | Yes | Similarity, [0, 1] |
Response example
    {
      "log_id": 3268660173,
      "result_num": 1,
      "result": [
        {
          "location": {
            "left": 132,
            "top": 168,
            "width": 238,
            "height": 223
          },
          "stars": [
            {
              "name": "Zhang San",
              "star_id": "515617",
              "probability": 0.9750030040741
            }
          ]
        }
      ]
    }
disgust
Function
Disgusting image recognition
Request parameters
None
Response parameters
| Parameters | Types | Required | Description |
|---|---|---|---|
| log_id | uint64 | Yes | Request identifier, a random and unique number |
| result | double | Yes | Score |
Response example
    {
      "result": 9.2708455667889E-7,
      "log_id": 2977989308
    }
watermark
Function
Watermark detection
Request parameters
None
Response parameters
| Parameters | Types | Required | Description |
|---|---|---|---|
| log_id | uint64 | Yes | Request identifier, a random and unique number |
| result_num | uint32 | No | Number of response results, i.e., the number of elements in the result array |
| result | array(object) | No | Response result array, each item as a detected result |
Each item in the result contains the following fields:
| Parameters | Types | Required | Description |
|---|---|---|---|
| location | object | No | Position information (pixel position from left, pixel position from top, pixel width, pixel height) |
| probability | double | Yes | Classification result confidence: [0-1.0] |
| type | string | Yes | Type of response result (watermark, bar code, QR code) |
Response example
    {
      "result": [{
          "probability": 0.99872654676437,
          "type": "watermark"
        },
        {
          "probability": 0.98578763008118,
          "type": "watermark"
        }],
      "log_id": 686882979,
      "result_num": 2
    }
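Because `result` and `result_num` are optional in this response, client code should tolerate their absence. The sketch below tallies detections by type under that assumption; the helper name `count_by_type` is hypothetical.

```python
from collections import Counter

# The watermark response example above, as a Python dict.
watermark_response = {
    "result": [
        {"probability": 0.99872654676437, "type": "watermark"},
        {"probability": 0.98578763008118, "type": "watermark"},
    ],
    "log_id": 686882979,
    "result_num": 2,
}


def count_by_type(response: dict) -> Counter:
    """Count detections per type; an absent `result` is treated as no detections."""
    return Counter(item["type"] for item in response.get("result", []))


print(count_by_type(watermark_response))
```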
quality
Function
Image quality
Request parameters
None
Response parameters
| Parameters | Subparameter | Types | Required | Description |
|---|---|---|---|---|
| log_id | - | uint64 | Yes | Log ID |
| result | - | object[] | Yes | |
| - | aesthetic | double | Yes | Aesthetics |
| - | clarity | double | Yes | Clarity |
Response example
    {
      "result": {
        "aesthetic": 0.26410301526388,
        "clarity": 0.28039423624674
      },
      "log_id": 93316007
    }
