Image review

Updated at: 2025-11-03

Overview

The image review service is an intelligent review solution from Baidu AI Cloud, offering multidimensional review capabilities, including pornography detection, violent terrorism recognition, political sensitivity analysis, disgusting content identification, and advertisement recognition. It is applicable in various scenarios, such as gaming, social platforms, forums, lifestyle services, and UGC (user-generated content). For example, game developers can use this service to review custom avatars uploaded by players.

The image review service is seamlessly integrated with BOS, allowing you to access its capabilities through BOS API, SDK, CLI, and other methods.

Charge type

BOS serves as the call entry point for the image review service and charges only for BOS API requests. For details, please refer to [BOS Product Pricing](BOS/Product pricing/Product price/Pay-As-You-Go Charge Type.md).

The fee for the image review service is charged by the image review product. For details, please refer to Image Review Product Documentation.

Enable the image review service

Currently, Baidu AI Cloud supports two ways of reviewing images stored in BOS: actively calling the image review API, and automatic image review. To use automatic image review, first enable the image review service in the console. Once the image review rules take effect, newly uploaded files are reviewed automatically. You can upload objects to BOS through the [console](BOS/Console Operation Guide/Manage object/Upload files.md), [SDK](BOS/Developer Guide/Object Basic Operations/Uploading Data/Simple upload.md), CLI, and other methods.

To enable the image review service on the console, please refer to Setting Up Image Review Service on Console.
After enabling the image review service, users can call the image review function through the BOS CLI Tool or BOS API. The following describes the API in detail.

API

This API enables access to the image review service.

Note: To use this API, you need to enable the image review service on the console first.

Request

Request syntax

Plain Text

POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
    "action" : {
        "sync" : [{
                "url" : "$(img-censor)",
                "parameters" : "<base64_encode(param)>"
            }
        ]
    }
}

Request parameters
- url: Fixed value $(img-censor); do not modify it.
- parameters: The base64-encoded image review parameters. Before encoding, the parameters are a JSON string.
  
  Basic structure of image review function parameters:
  Plain Text
  {
      "antiporn" : {},
      "terror" : {},
      "ocr" : {
          "detect_direction" : "false",
          "language_type" : "CHN_ENG",
          "recognize_granularity" : "big"
      }
  }
  The parameters can combine any number of sub-services as needed, or name a single sub-service. For pornography detection alone, for example, the parameters are written as:
  Plain Text
  {
      "antiporn" : {}
  }
  The list of sub-services supported by the image review service includes:
  - ocr: General optical character recognition
  - face: Face detection
  - antiporn: Pornography recognition
  - politician: Politician recognition
  - terror: Violent terrorism recognition
  - public: Public figure recognition
  - disgust: Disgusting image recognition
  - watermark: Watermark QR code recognition
  - quality: Image quality recognition

  The value of each sub-service key corresponds to the parameters of the underlying service. For ocr, for example, refer to the input parameters of the Hetu general optical character recognition (OCR) service.
  Detailed explanations of request and response parameters for sub-services are provided below.
Request headers

None

Response

Response headers

None

Response elements

Refer to the explanations for response parameters of each sub-service.

Example

Request example

Plain Text

{
    "antiporn": {},
    "ocr": {
        "detect_direction": "false",
        "language_type": "CHN_ENG",
        "recognize_granularity": "big"
    }
}

After base64 encoding

Plain Text

eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ==
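
The encoding step can be reproduced with a short Python sketch. The helper name `encode_censor_params` is an illustrative assumption and not part of any BOS SDK. It serializes the parameters as compact JSON (no whitespace), which is the form that yields the encoded string shown above.

```python
import base64
import json


def encode_censor_params(params: dict) -> str:
    """Serialize sub-service parameters to compact JSON, then base64-encode.

    Hypothetical helper for illustration; not part of the BOS SDK.
    """
    # Compact separators keep the JSON free of whitespace.
    raw = json.dumps(params, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


params = {
    "antiporn": {},
    "ocr": {
        "detect_direction": "false",
        "language_type": "CHN_ENG",
        "recognize_granularity": "big",
    },
}
encoded = encode_censor_params(params)
print(encoded)
```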

Fill the encoded string into parameters. The request sent is as follows:

Plain Text

POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
    "action" : {
        "sync" : [{
                "url" : "$(img-censor)",
                "parameters" : "eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ=="
            }
        ]
    }
}
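
As a rough sketch, the request body can be assembled in Python as follows. The function name `build_censor_body` is hypothetical. The Date and Authorization headers are deliberately omitted because request signing is outside the scope of this page, so the snippet builds the body and the two content headers without sending anything.

```python
import base64
import json


def build_censor_body(params: dict) -> str:
    """Build the JSON request body for the image review API.

    Hypothetical helper; the body layout follows the request syntax above.
    """
    raw = json.dumps(params, separators=(",", ":"))
    encoded = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    body = {
        "action": {
            "sync": [
                # The url value is fixed and must not be modified.
                {"url": "$(img-censor)", "parameters": encoded}
            ]
        }
    }
    return json.dumps(body)


body = build_censor_body({"antiporn": {}})
headers = {
    "Content-Type": "application/json; charset=utf-8",
    "Content-Length": str(len(body.encode("utf-8"))),
}
print(body)
```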

Successful response example

Plain Text

HTTP/1.1 200 OK
Date: Thu, 22 Jun 2017 07:30:56 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 237
Connection: keep-alive
Server: BceBos
x-bce-debug-id: MTAuNzUuNzguNDA6VGh1LCAyMiBKdW4gMjAxNyAxNTozMDo1NiBDU1Q6MTg1MDY5NDg3OQ==
x-bce-request-id: 598f7e18-77fb-424a-bc68-95acb0644076

{
    "result" : {
        "antiporn" : {
            "result" : [{
                    "probability" : 0.000071,
                    "class_name": "Pornography"
                }, {
                    "probability" : 0.000291,
                    "class_name": "Sexy"
                }, {
                    "probability" : 0.999638,
                    "class_name": "Normal"
                }
            ],
            "log_id" : 1853911322,
            "result_num" : 3
        },
        "ocr" : {
            "log_id" : 2471272194,
            "words_result_num" : 2,
            "words_result" : [{
                    "words" : " TSINGTAO"
                }, {
                    "words": "Qingdao"
                }
            ]
        }
    },
    "log_id" : 149811665151162
}
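
The combined response nests one entry per requested sub-service under the top-level result key. A minimal sketch of reading it, assuming the body has already been parsed into a Python dict (the function name `top_antiporn_class` and the trimmed `sample` are illustrative):

```python
def top_antiporn_class(response: dict) -> str:
    """Return the class_name with the highest probability in the antiporn result."""
    classes = response["result"]["antiporn"]["result"]
    return max(classes, key=lambda item: item["probability"])["class_name"]


# Trimmed copy of the successful response example above.
sample = {
    "result": {
        "antiporn": {
            "result": [
                {"probability": 0.000071, "class_name": "Pornography"},
                {"probability": 0.000291, "class_name": "Sexy"},
                {"probability": 0.999638, "class_name": "Normal"},
            ],
            "log_id": 1853911322,
            "result_num": 3,
        },
        "ocr": {"log_id": 2471272194, "words_result_num": 0, "words_result": []},
    },
    "log_id": 149811665151162,
}

returned_services = sorted(sample["result"])  # sub-services present in the response
print(returned_services, top_antiporn_class(sample))
```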

Failed response example

Plain Text

{
    "log_id": 149319909347709,
    "error_code": 216500,
    "error_msg": "unknown error"
}

Explanation of error_code values in failed responses:

Error code	Error message	Description
216101	not enough param	Insufficient parameters
216102	service not support	An unsupported underlying service type is entered
216200	empty imge	No image URL (the message is returned with this spelling)
216500	unknown error	Unknown error
282804	download image error	Image download failed
282000	logic internal error	Service logic layer error
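
One way to handle failed responses on the client side is to check for error_code before reading any results. The sketch below is an assumption about client design rather than documented behavior: the names `ImageReviewError` and `raise_for_error` are hypothetical, and the lookup table simply mirrors the error codes listed above.

```python
ERROR_DESCRIPTIONS = {
    216101: "Insufficient parameters",
    216102: "An unsupported underlying service type is entered",
    216200: "No image URL",
    216500: "Unknown error",
    282804: "Image download failed",
    282000: "Service logic layer error",
}


class ImageReviewError(Exception):
    """Raised when a response body carries an error_code (hypothetical class)."""

    def __init__(self, code: int, message: str):
        self.code = code
        self.message = message
        self.description = ERROR_DESCRIPTIONS.get(code, "Unrecognized error code")
        super().__init__(f"{code} {message}: {self.description}")


def raise_for_error(response: dict) -> dict:
    """Return the response unchanged, or raise if it contains error_code."""
    code = response.get("error_code")
    if code is None:
        return response
    raise ImageReviewError(code, response.get("error_msg", ""))


failed = {"log_id": 149319909347709, "error_code": 216500, "error_msg": "unknown error"}
try:
    raise_for_error(failed)
except ImageReviewError as exc:
    print(exc)
```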

ocr

Function

The user sends a request for the service to recognize all characters within an image.

Request parameters

Parameters	Required or not	Types	Option range	Description
language_type	No	string	CHN_ENG, ENG, POR, FRE, GER, ITA, SPA, RUS, JAP, KOR	Recognition language type; defaults to CHN_ENG. Available options: CHN_ENG: Chinese-English mixed; ENG: English; POR: Portuguese; FRE: French; GER: German; ITA: Italian; SPA: Spanish; RUS: Russian; JAP: Japanese; KOR: Korean.
detect_direction	No	boolean	true, false	Whether to detect image orientation; defaults to false (no detection). Orientation refers to whether the input image is upright or rotated 90/180/270 degrees counterclockwise. Available options: true: detect orientation; false: do not detect orientation.
detect_language	No	string	true, false	Whether to detect the language; defaults to no detection. Currently supports Chinese, English, Japanese, and Korean.
probability	No	string	true, false	Whether to return the confidence level for each row in the recognition results

Response parameters

Field	Required or not	Types	Description
direction	No	int32	Image orientation, returned when detect_direction=true. Possible values: -1: undefined; 0: upright; 1: rotated 90 degrees counterclockwise; 2: rotated 180 degrees counterclockwise; 3: rotated 270 degrees counterclockwise.
log_id	Yes	uint64	A unique log ID for issue localization
words_result	Yes	array()	Recognition result array
words_result_num	Yes	uint32	Number of recognition results, indicating the number of elements in words_result
+words	No	string	Recognition result string
probability	No	object	Confidence value for each row in the recognition result, including average: average row confidence, variance: row confidence variance, min: minimum row confidence

Response example

Plain Text

{
    "log_id": 2471272194,
    "words_result_num": 2,
    "words_result":
    [
        {"words": " TSINGTAO"},
        {"words": "Tsingtao Beer"}
    ]
}
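
As the example shows, recognized strings may carry leading whitespace. A small sketch for collecting the recognized lines from an ocr result follows; the function name `extract_lines` is illustrative, and stripping whitespace is a client-side choice rather than something the service requires.

```python
from typing import List


def extract_lines(ocr_result: dict) -> List[str]:
    """Collect the recognized strings from words_result, trimmed of whitespace."""
    return [item["words"].strip() for item in ocr_result.get("words_result", [])]


ocr_result = {
    "log_id": 2471272194,
    "words_result_num": 2,
    "words_result": [{"words": " TSINGTAO"}, {"words": "Tsingtao Beer"}],
}
lines = extract_lines(ocr_result)
print(lines)
```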

face

Function

Detects faces in the input image and provides face locations, 72 key point coordinates, and face-related attribute information.
The detection response time depends on the number of faces in the image. A larger number of faces may slightly slow down the response.
Typical application scenarios include facial attribute analysis, processing and analysis based on key face points, and facial marketing campaigns.
Facial features are marked with specific coordinates. The 72 key points are also returned as exact coordinates, but without a description of which facial position each point corresponds to.

Request parameters

None

Response parameters

Parameters	Types	Required or not	Description
log_id	uint64	Yes	Log ID
result_num	uint32	Yes	Face count
result	object[]	Yes	Collection of facial attribute objects
+age	double	No	Age information is returned if face_fields includes age.
+beauty	double	No	Beauty score ranges from 0 to 100, with higher scores indicating greater beauty. This value is returned when face_fields includes beauty.
+location	object	Yes	Position of face in the image
++left	uint32	Yes	Distance from face region to the left boundary
++top	uint32	Yes	Distance from face region to the upper boundary
++width	uint32	Yes	Width of face region
++height	uint32	Yes	Height of face region
+face_probability	double	Yes	Face confidence, ranging 0-1
+rotation_angle	int32	Yes	Clockwise rotation angle of face frame relative to vertical direction, [-180,180]
+yaw	double	Yes	3D rotation - left/right rotation angle: [-90 (left), 90(right)]
+pitch	double	Yes	3D rotation - pitch angle [-90 (Upper), 90 (Lower)]
+roll	double	Yes	In-plane rotation angle [-180 (counterclockwise), 180(clockwise)]
+expression	uint32	No	Expression: 0 for no smile or laugh, 1 for smile, and 2 for laugh. This is returned when face_fields includes expression.
+expression_probability	double	No	Expression confidence ranges from 0 to 1 and is provided when face_fields includes expression.
+faceshape	object[]	No	Face shape confidence is returned if face_fields includes faceshape.
++type	string	Yes	Face shape: square/triangle/oval/heart/round
++probability	double	Yes	Confidence: 0~1
+gender	string	No	Gender (male or female) is returned when face_fields includes gender.
+gender_probability	double	No	Gender confidence ranges from 0 to 1 and is returned if face_fields includes gender.
+glasses	uint32	No	Glasses: 0 for no glasses, 1 for ordinary glasses, and 2 for sunglasses. This is returned when face_fields includes glasses.
+glasses_probability	double	No	Glasses confidence ranges from 0 to 1 and is included if face_fields includes glasses.
+landmark	object[]	No	Key point positions for the left eye center, right eye center, nose tip, and mouth center are provided when face_fields includes landmark.
++x	uint32	No	X-coordinate
++y	uint32	No	Y-coordinate
+landmark72	object[]	No	Seventy-two feature point positions are illustrated and returned when face_fields includes landmark.
++x	uint32	No	X-coordinate
++y	uint32	No	Y-coordinate
+race	string	No	Race options include yellow, white, black, and Arab, and are returned when face_fields includes race.
+race_probability	double	No	Race confidence ranges from 0 to 1 and is returned if face_fields includes race.
+qualities	object	No	Face quality information is included if face_fields includes qualities.
++occlusion	object	Yes	Probability of occlusion for facial parts, [0, 1] (pending launch)
+++left_eye	double	Yes	Left eye
+++right_eye	double	Yes	Right eye
+++nose	double	Yes	Nose
+++mouth	double	Yes	Mouth
+++left_cheek	double	Yes	Left cheek
+++right_cheek	double	Yes	Right cheek
+++chin	double	Yes	Chin
++blur	double	Yes	Face blurriness, [0, 1]. 0 indicates clear; 1 indicates blurry (pending launch)
++illumination	-	Yes	The value range is [0,255], indicating the lighting level of the facial area (pending launch)
++completeness	-	Yes	Face completeness, [0, 1]. 0 indicates complete; 1 indicates incomplete (pending launch)
++type	object	Yes	Real face/cartoon face confidence
+++human	-	Yes	Real face confidence, [0, 1]
+++cartoon	-	Yes	Cartoon face confidence, [0, 1]

Response example

Plain Text

{
    "result_num": 1,
    "result": [
        {
            "location": {
                "left": 90,
                "top": 92,
                "width": 111,
                "height": 99
            },
            "face_probability": 1,
            "rotation_angle": 6,
            "yaw": 11.61234664917,
            "pitch": -0.30852827429771,
            "roll": 8.8044967651367,
            "landmark": [
                {
                    "x": 105,
                    "y": 110
                },
              ...
            ],
            "landmark72": [
                {
                    "x": 88,
                    "y": 109
                },
               ...
            ],
            "gender": "male",
            "gender_probability": 0.99358034133911,
            "glasses": 0,
            "glasses_probability": 0.99991309642792,
            "race": "yellow",
            "race_probability": 0.99960690736771,
            "qualities": {
                "occlusion": {
                    "left_eye": 0.000085282314103097,
                    "right_eye": 0.00001094374601962,
                    "nose": 3.2677664307812e-7,
                    "mouth": 2.6582130940866e-10,
                    "left_cheek": 8.752236624332e-8,
                    "right_cheek": 1.0212766454742e-7,
                    "chin": 4.2632994357028e-10
                },
                "blur": 4.5613666312237e-41,
                "illumination": 0,
                "completeness": 0,
                "type": {
                    "human": 0.98398965597153,
                    "cartoon": 0.016010366380215
                }
            }
        }
    ],
    "log_id": 2418894422
}
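
The location object gives the face region as an offset plus a size. Many imaging tools expect corner coordinates instead, so a conversion such as the following sketch can be useful. The function name `face_boxes` is illustrative, and the (left, top, right, bottom) convention is an assumption about the consuming tool.

```python
from typing import List, Tuple


def face_boxes(face_result: dict) -> List[Tuple[int, int, int, int]]:
    """Convert each face location to a (left, top, right, bottom) pixel box."""
    boxes = []
    for face in face_result.get("result", []):
        loc = face["location"]
        boxes.append(
            (loc["left"], loc["top"], loc["left"] + loc["width"], loc["top"] + loc["height"])
        )
    return boxes


face_result = {
    "result_num": 1,
    "result": [{"location": {"left": 90, "top": 92, "width": 111, "height": 99}}],
    "log_id": 2418894422,
}
boxes = face_boxes(face_result)
print(boxes)
```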

antiporn

Function

Pornography detection.

Request parameters

None

Response parameters

Field	Types	Required or not	Description
confidence_coefficient	string	Yes	The results are classified into two categories: "Definite" and "Indefinite".
result_num	uint32	Yes	The number of response results corresponds to the number of elements in the result array.
result	array(array(double))	Yes	The results array contains items, each representing an outcome from a classification dimension.
conclusion	string	Yes	The final evaluation result for this image is categorized into three types: "Pornography", "Sexy", and "Normal".
log_id	uint64	Yes	Request identifier: a unique, randomly generated number.

Each element contains the following fields:

Field	Types	Required or not	Description	Example
class_name	string	Yes	Classification result name	Pornography
probability	double	Yes	Classification result confidence	0.89471650123596

Response example

Plain Text

{
    "result": [{
        "probability": 0.000301,
        "class_name": "Pornography"
    },
    {
        "probability": 0.000054,
        "class_name": "Sexy"
    },
    {
        "probability": 0.999645,
        "class_name": "Normal"
    }],
    "conclusion": "Normal",
    "log_id": 848999404,
    "confidence_coefficient": "Definite",
    "result_num": 3
}
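
The conclusion and confidence_coefficient fields can drive a simple moderation decision. The mapping below is only an example policy and an assumption of this sketch, not a recommendation from the service: results marked "Indefinite" and images judged "Sexy" go to manual review, "Pornography" is blocked, and "Normal" passes. The function name `review_action` is hypothetical.

```python
def review_action(antiporn_result: dict) -> str:
    """Map an antiporn result to "pass", "review", or "block" (example policy)."""
    # Anything the service is not definite about goes to a human reviewer.
    if antiporn_result.get("confidence_coefficient") != "Definite":
        return "review"
    conclusion = antiporn_result.get("conclusion")
    if conclusion == "Pornography":
        return "block"
    if conclusion == "Normal":
        return "pass"
    # "Sexy" or any unexpected value falls back to manual review.
    return "review"


example = {
    "conclusion": "Normal",
    "confidence_coefficient": "Definite",
    "result_num": 3,
    "log_id": 848999404,
}
print(review_action(example))
```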

politician

Function

Politician identification.

Request parameters

None

Response parameters

Parameters	Subparameter	Subparameter	Types	Required or not	Description
log_id	-	-	uint64	Yes	Log ID
result_num	-	-	uint32	Yes	Actual number of faces detected (not exceeding max_face_num)
result	-	-	object[]	Yes
-	location	-	object	Yes	Position of face in the input image
-	-	left	uint32	Yes	Distance from face region to the left boundary
-	-	top	uint32	Yes	Distance from face region to the upper boundary
-	-	width	uint32	Yes	Width of face region
-	-	height	uint32	Yes	Height of face region
-	stars	-	object[]	Yes	Public figure array
-	-	name	string	Yes	Name
-	-	star_id	string	Yes	Figure ID, globally unique
-	-	probability	float	Yes	Similarity, [0, 1]

Response example

Plain Text

{
    "log_id": 3268660173,
    "result_num": 1,
    "result": [
        {
            "location": {
                "left": 132,
                "top": 168,
                "width": 238,
                "height": 223
            },
            "stars": [
                {
                    "name": "Zhang San",
                    "star_id": "515617",
                    "probability": 0.9750030040741
                }
            ]
        }
    ]
}
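
Each detected face carries its own stars array, so a client usually flattens the result and keeps only sufficiently similar matches. A minimal sketch, in which the function name `confident_matches` and the 0.9 threshold are assumptions chosen for illustration:

```python
from typing import List, Tuple


def confident_matches(result: dict, min_probability: float = 0.9) -> List[Tuple[str, str, float]]:
    """Return (name, star_id, probability) for matches at or above the threshold."""
    matches = []
    for face in result.get("result", []):
        for star in face.get("stars", []):
            if star["probability"] >= min_probability:
                matches.append((star["name"], star["star_id"], star["probability"]))
    return matches


politician_result = {
    "log_id": 3268660173,
    "result_num": 1,
    "result": [
        {
            "location": {"left": 132, "top": 168, "width": 238, "height": 223},
            "stars": [
                {"name": "Zhang San", "star_id": "515617", "probability": 0.9750030040741}
            ],
        }
    ],
}
matches = confident_matches(politician_result)
print(matches)
```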

terror

Function

Violent terrorism recognition

Request parameters

None

Response parameters

Field	Types	Required or not	Description
result	array(array(double))	Yes	Violent terrorism confidence score.
log_id	uint64	Yes	Request identifier, a random and unique number
result_coarse	object[]	Yes	Coarse-grained score result
+name	string	Yes	Coarse-grained tags, including two tags: normal and violent terrorism
+score	float	Yes	Confidence score of the corresponding tag. The higher the score, the higher the confidence.
result_fine	object[]	Yes	Fine-grained score result
+name	string	Yes	Fine-grained tags, including 9 tags: normal, police force, bloody, corpse, explosion and fire, murder, riot, terrorists, and military weapons
+score	float	Yes	Confidence score of the corresponding tag. The higher the score, the higher the confidence.

Response example

Plain Text

{
    "errno": 0,
    "msg": "success",
    "data": {
        "result": 0.0082325544208288,
        "result_coarse": [
            {
                "name": "Normal",
                "score": 0.99176746606827
            },
            {
                "name": "Violent Terrorism",
                "score": 0.0082325544208288
            }
        ],
        "result_fine": [
            {
                "name": "Normal",
                "score": 0.98908758163452
            },
            {
                "name": "Police Force",
                "score": 0.0062405453063548
            },
            {
                "name": "Bloody",
                "score": 0.0009653537417762
            },
            {
                "name": "Corpse",
                "score": 0.001054480439052
            },
            {
                "name": "Explosion and Fire",
                "score": 0.00011743687355192
            },
            {
                "name": "Murder",
                "score": 0.0011699661845341
            },
            {
                "name": "Riot",
                "score": 0.000021190358893364
            },
            {
                "name": "Terrorist",
                "score": 0.0010401027975604
            },
            {
                "name": "Military Weapons",
                "score": 0.00030337597127073
            }
        ]
    }
}
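
To act on the fine-grained scores, a client can list the non-normal tags whose score reaches a chosen threshold. In the sketch below, the function name `flagged_tags` and the threshold value are assumptions. The function expects the object that holds result_fine, which in the example above is the data object.

```python
from typing import List, Tuple


def flagged_tags(terror_result: dict, threshold: float) -> List[Tuple[str, float]]:
    """Return non-normal fine-grained tags scoring at or above threshold, highest first."""
    flagged = [
        (tag["name"], tag["score"])
        for tag in terror_result.get("result_fine", [])
        if tag["name"].lower() != "normal" and tag["score"] >= threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Trimmed copy of the data object from the response example above.
data = {
    "result_fine": [
        {"name": "Normal", "score": 0.98908758163452},
        {"name": "Police Force", "score": 0.0062405453063548},
        {"name": "Bloody", "score": 0.0009653537417762},
        {"name": "Corpse", "score": 0.001054480439052},
    ]
}
flagged = flagged_tags(data, threshold=0.001)
print(flagged)
```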

public

Function

Public figure recognition

Request parameters

Parameters	Required or not	Types	Description
max_face_num	No	uint32	Maximum number of faces processed, defaulting to 1, maximum 5
max_star_num	No	uint32	Maximum number of similar celebrities per face, defaulting to 4

Response parameters

Parameters	Subparameter	Subparameter	Types	Required	Description
log_id	-	-	uint64	Yes	Log ID
result_num	-	-	uint32	Yes	Actual number of faces detected (not exceeding max_face_num)
result	-	-	object[]	Yes
-	location	-	object	Yes	Position of face in the input image
-	-	left	uint32	Yes	Distance from face region to the left boundary
-	-	top	uint32	Yes	Distance from face region to the upper boundary
-	-	width	uint32	Yes	Width of face region
-	-	height	uint32	Yes	Height of face region
-	stars	-	object[]	Yes	Public figure array
-	-	name	string	Yes	Name
-	-	star_id	string	Yes	Figure ID, globally unique
-	-	probability	float	Yes	Similarity, [0, 1]

Response example

Plain Text

{
    "log_id": 3268660173,
    "result_num": 1,
    "result": [
        {
            "location": {
                "left": 132,
                "top": 168,
                "width": 238,
                "height": 223
            },
            "stars": [
                {
                    "name": "Zhang San",
                    "star_id": "515617",
                    "probability": 0.9750030040741
                }
            ]
        }
    ]
}

disgust

Function

Disgusting image recognition

Request parameters

None

Response parameters

Parameters	Types	Required	Description
log_id	uint64	Yes	Request identifier, a random and unique number
result	double	Yes	Score

Response example

Plain Text

{
    "result": 9.2708455667889E-7,
    "log_id": 2977989308
}

watermark

Function

Watermark detection

Request parameters

None

Response parameters

Parameters	Types	Required	Description
log_id	uint64	Yes	Request identifier, a random and unique number
result_num	uint32	No	Number of response results, i.e., the number of elements in the result array
result	array(object)	No	Response result array, each item as a detected result

Each item in the result contains the following fields:

Parameters	Types	Required	Description
location	object	No	Position information (pixel position from left, pixel position from top, pixel width, pixel height)
probability	double	Yes	Classification result confidence: [0-1.0]
type	string	Yes	Type of response result (watermark, bar code, QR code)

Response example

Plain Text

{
    "result": [{
        "probability": 0.99872654676437,
        "type": "watermark"
    },
    {
        "probability": 0.98578763008118,
        "type": "watermark"
    }],
    "log_id": 686882979,
    "result_num": 2
}
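
Because a single image can contain several detections of different types, it can help to count them per type. A minimal sketch using the standard library, with the function name `count_by_type` being illustrative:

```python
from collections import Counter


def count_by_type(watermark_result: dict) -> Counter:
    """Count detections per type; result is optional, so default to an empty list."""
    return Counter(item["type"] for item in watermark_result.get("result", []))


watermark_result = {
    "result": [
        {"probability": 0.99872654676437, "type": "watermark"},
        {"probability": 0.98578763008118, "type": "watermark"},
    ],
    "log_id": 686882979,
    "result_num": 2,
}
counts = count_by_type(watermark_result)
print(dict(counts))
```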

quality

Function

Image quality

Request parameters

None

Response parameters

Parameters	Subparameter	Types	Required	Description
log_id	-	uint64	Yes	Log ID
result	-	object[]	Yes
-	aesthetic	double	Yes	Aesthetics
-	clarity	double	Yes	Clarity

Response example

Plain Text

{
    "result": {
        "aesthetic": 0.26410301526388,
        "clarity": 0.28039423624674
    },
    "log_id": 93316007
}
