Photo Review

Last Updated: 2020-10-21

Overview

Photo review is an intelligent image review service provided by Baidu AI Cloud. It reviews photos along multiple dimensions, including pornography recognition, violent-terrorist recognition, politically sensitive content recognition, disgusting photo recognition, and advertising recognition. Photo review applies to a wide range of scenarios, including games, social networking, forums, life services, and UGC. For example, a game maker can use the photo review service to review avatars uploaded by players.

Because the photo review service is deeply integrated with BOS, you can call the photo review capability through the BOS API, SDK, and CLI.

Charging Mode

BOS is only the call entry for the photo review service, so BOS charges only the BOS API request fee. For details, see BOS Pricing.

The fee for photo review service is charged through the photo review product.

Subscribing to the Photo Review Service

At present, Baidu AI Cloud supports reviewing photos stored on BOS. You can upload an object to BOS through the Console, SDK, or CLI.

1. To subscribe to the photo review service, choose "photo review" in the "data processing" module on the global overview page and click "use right now".

After subscribing, you can call the photo review function through the BOS CLI tools or the BOS API. The API is described below.

API Interface

This interface requests the photo review service.

Note: You can use this interface only if you have subscribed to the photo review service in the console.

Request

Request syntax

POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
	"action" : {
		"sync" : [{
				"url" : "$(img-censor)",
				"parameters" : "<base64_encode(param)>"
			}
		]
	}
}

Request parameter
- url: Fixed value $(img-censor); do not modify it.
- parameters: The base64-encoded photo review functional parameter. Before encoding, the functional parameter is itself a JSON string.

Basic structure of the photo review functional parameter:
```
 { 
   "antiporn" : {}, 
   "terror" : {}, 
   "ocr" : { 
       "detect_direction" : "false", 
       "language_type" : "CHN_ENG",
       "recognize_granularity" : "big" 
   } 
 } 
```
You can combine any sub-services as needed, or request a single sub-service such as pornography recognition. In that case, the parameter is written as:
```
{ 
  "antiporn" : {} 
} 
```

The sub-services supported by the photo review service are:

- ocr: Universal text recognition
- face: Face detection
- antiporn: Pornography recognition
- politician: Political figure recognition
- terror: Violent-terrorist recognition
- public: Public figure recognition
- disgust: Disgusting photo recognition
- watermark: Watermark and QR code recognition
- quality: Photo quality recognition

The value of each sub-service key is the parameter of the underlying service. For ocr, see the input parameters of the Hetu OCR Universal Text Recognition Service.

For detailed explanations of each sub-service's request and return parameters, see the sections below.

Request header

None

Response

Response header

None

Response element

See the return parameter descriptions for each sub-service.

Example

Request example

{ 
    "antiporn": {}, 
    "ocr": { 
        "detect_direction": "false", 
        "language_type": "CHN_ENG",
        "recognize_granularity": "big" 
    } 
}

After base64 encoding (of the compact JSON, with whitespace removed), the value is:

eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ==

With this value filled into parameters, the request is as follows:

POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
	"action" : {
		"sync" : [{
				"url" : "$(img-censor)",
				"parameters" : "eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ=="
			}
		]
	}
}
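
The request body above can also be assembled in code. The following is a minimal Python sketch rather than an official SDK call: the helper names `encode_parameters` and `build_request_body` are hypothetical, and it assumes the functional parameter is serialized as compact JSON (no whitespace), which is the form that reproduces the encoded value shown in this example. Signing the request and sending it to BOS are not covered.

```python
import base64
import json


def encode_parameters(params: dict) -> str:
    """Serialize the functional parameter as compact JSON and base64-encode it."""
    # Compact separators remove the spaces json.dumps inserts by default.
    compact = json.dumps(params, separators=(",", ":"))
    return base64.b64encode(compact.encode("utf-8")).decode("ascii")


def build_request_body(params: dict) -> str:
    """Wrap the encoded parameter in the action/sync envelope from the request syntax."""
    body = {
        "action": {
            "sync": [
                {
                    "url": "$(img-censor)",  # fixed value, per the parameter list
                    "parameters": encode_parameters(params),
                }
            ]
        }
    }
    return json.dumps(body)


if __name__ == "__main__":
    example = {
        "antiporn": {},
        "ocr": {
            "detect_direction": "false",
            "language_type": "CHN_ENG",
            "recognize_granularity": "big",
        },
    }
    print(build_request_body(example))
```

Whether the service also accepts functional parameters encoded with whitespace is not stated here, so the compact form is the safer choice.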

Successful response example

HTTP/1.1 200 OK
Date: Thu, 22 Jun 2017 07:30:56 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 237
Connection: keep-alive
Server: BceBos
x-bce-debug-id: MTAuNzUuNzguNDA6VGh1LCAyMiBKdW4gMjAxNyAxNTozMDo1NiBDU1Q6MTg1MDY5NDg3OQ==
x-bce-request-id: 598f7e18-77fb-424a-bc68-95acb0644076

{
	"result" : {
		"antiporn" : {
			"result" : [{
					"probability" : 0.000071,
					"class_name" : "Pornographic"
				}, {
					"probability" : 0.000291,
					"class_name" : "Sexy"
				}, {
					"probability" : 0.999638,
					"class_name" : "Normal"
				}
			],
			"log_id" : 1853911322,
			"result_num" : 3
		},
		"ocr" : {
			"log_id" : 2471272194,
			"words_result_num" : 2,
			"words_result" : [{
					"words" : " TSINGTAO"
				}, {
					"words" : "Qingdao"
				}
			]
		}
	},
	"log_id" : 149811665151162
}

Failed response example

{ 
    "log_id": 149319909347709,
    "error_code": 216500,
    "error_msg": "unknown error" 
}

Error codes for failed requests:

Error code	Error message	Description
216101	not enough param	Insufficient parameters
216102	service not support	An unsupported underlying service type was specified
216200	empty photo	No photo URL was provided
216500	unknown error	Unknown error
282804	download photo error	Failed to download the photo
282000	logic internal error	Internal logic error
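
A caller usually needs to tell a failed response from a successful one before reading sub-service results. The sketch below is one way to do that in Python. It assumes, based only on the two examples above, that a failed response carries a top-level `error_code` and a successful one carries a top-level `result`; the function name `check_response` is hypothetical.

```python
import json

# Descriptions taken from the error code table above.
ERROR_DESCRIPTIONS = {
    216101: "Insufficient parameters",
    216102: "An unsupported underlying service type was specified",
    216200: "No photo URL was provided",
    216500: "Unknown error",
    282804: "Failed to download the photo",
    282000: "Internal logic error",
}


def check_response(body: str) -> dict:
    """Parse a response body; raise on failure, otherwise return the per-sub-service results."""
    data = json.loads(body)
    if "error_code" in data:
        code = data["error_code"]
        # Fall back to the server-provided message for codes not in the table.
        description = ERROR_DESCRIPTIONS.get(code, data.get("error_msg", "unrecognized error"))
        raise RuntimeError(f"photo review failed ({code}): {description}")
    return data["result"]
```

Including `log_id` in the raised message may also help when reporting a problem, since the tables describe it as the identifier used for troubleshooting.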

Ocr

Feature

Recognizes all text in a photo.

Request Parameter

Parameter	Required	Type	Optional values	Description
language_type	No	string	CHN_ENG, ENG, POR, FRE, GER, ITA, SPA, RUS, JAP, KOR	Language to recognize; the default is CHN_ENG. Values: CHN_ENG (mixed Chinese and English), ENG (English), POR (Portuguese), FRE (French), GER (German), ITA (Italian), SPA (Spanish), RUS (Russian), JAP (Japanese), KOR (Korean).
detect_direction	No	boolean	true, false	Whether to detect the image orientation; the default is false (no detection). Orientation indicates whether the input image is upright or rotated 90, 180, or 270 degrees counterclockwise. Values: true (detect orientation), false (do not detect orientation).
detect_language	No	string	true, false	Whether to detect the language; the default is no detection. Currently supported languages: Chinese, English, Japanese, and Korean.
probability	No	string	true, false	Whether to return the confidence of each row in the recognition result.

Return parameter

Field	Required	Type	Description
direction	No	int32	Image orientation; present only when detect_direction = true. Values: -1 (undefined), 0 (upright), 1 (90 degrees counterclockwise), 2 (180 degrees counterclockwise), 3 (270 degrees counterclockwise)
log_id	Yes	uint64	Unique log ID, used for troubleshooting
words_result	Yes	array()	Array of recognition results
words_result_num	Yes	uint32	Number of recognition results, that is, the number of elements in words_result
+words	No	string	Recognized text string
probability	No	object	Confidence of each row in the recognition result, containing average (row confidence average), variance (row confidence variance), and min (row confidence minimum)

Return Example

{ 
"log_id": 2471272194, 
"words_result_num": 2,
"words_result": 
    [ 
        {"words": " TSINGTAO"}, 
        {"words": "Tsingtao Beer"} 
    ] 
}
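
To use the OCR result, a caller typically pulls the text rows out of `words_result`. The Python sketch below shows one way to do that. The function name `extract_lines` is hypothetical, and stripping surrounding whitespace is an assumption made because the example contains a row with a leading space.

```python
from typing import List


def extract_lines(ocr_result: dict) -> List[str]:
    """Return the recognized text rows with surrounding whitespace removed."""
    rows = ocr_result.get("words_result", [])
    # words_result_num is documented as the number of elements in words_result.
    if ocr_result.get("words_result_num", len(rows)) != len(rows):
        raise ValueError("words_result_num does not match words_result")
    return [row["words"].strip() for row in rows]
```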

Face

Feature

Detects faces in the requested photo and returns the face position, the coordinates of 72 key points, and associated face attributes.
Detection response time depends on the number of faces in the photo; it is slightly longer when there are many faces.
Typical application scenarios include face attribute analysis, processing based on face key points, and face marketing campaigns.
The positions of the facial features are returned as specific coordinates. The 72 key points are also returned as coordinates, but without detailed descriptions of the positions they correspond to.

Request Parameter

None

Return Parameter

Parameter	Type	Required or not	Description
log_id	uint64	Yes	Log id
result_num	uint32	Yes	Number of faces
result	object[]	Yes	Collection of face attribute objects
+age	double	No	Age. Returned when face_fields includes age
+beauty	double	No	Beauty score in the range 0-100; the higher the score, the more beautiful. Returned when face_fields includes beauty
+location	object	Yes	Face position in the image
++left	uint32	Yes	Distance of face region from left border
++top	uint32	Yes	Distance of face region from upper border
++width	uint32	Yes	Face region width
++height	uint32	Yes	Face region height
+face_probability	double	Yes	Face confidence, in the range 0 to 1
+rotation_angle	int32	Yes	Clockwise rotation angle of the face frame relative to the vertical direction, [-180, 180]
+yaw	double	Yes	Left and right rotation angle of 3D rotation [-90 (left), 90 (right)]
+pitch	double	Yes	Pitch angle of 3D rotation [-90 (up), 90 (down)]
+roll	double	Yes	Rotation angle in the plane [-180 (counterclockwise), 180 (clockwise)]
+expression	uint32	No	Facial expression: 0 (not smiling), 1 (smile), 2 (laugh). Returned when face_fields includes expression
+expression_probability	double	No	Expression confidence, in the range 0 to 1. Returned when face_fields includes expression
+faceshape	object[]	No	Face shape confidence. Returned when face_fields includes faceshape
++type	string	Yes	Face type: square/triangle/oval/heart/round
++probability	double	Yes	Confidence: 0~1
+gender	string	No	male or female. Returned when face_fields includes gender
+gender_probability	double	No	Gender confidence, in the range 0 to 1. Returned when face_fields includes gender
+glasses	uint32	No	Whether glasses are worn: 0 (no glasses), 1 (ordinary glasses), 2 (sunglasses). Returned when face_fields includes glasses
+glasses_probability	double	No	Glasses confidence, in the range 0 to 1. Returned when face_fields includes glasses
+landmark	object[]	No	Positions of 4 key points: left eye center, right eye center, nose tip, and mouth center. Returned when face_fields includes landmark
++x	uint32	No	x coordinate
++y	uint32	No	y coordinate
+landmark72	object[]	No	Positions of 72 feature points. Returned when face_fields includes landmark
++x	uint32	No	x coordinate
++y	uint32	No	y coordinate
+race	string	No	yellow, white, black, or arabs. Returned when face_fields includes race
+race_probability	double	No	Race confidence, in the range 0 to 1. Returned when face_fields includes race
+qualities	object	No	Face quality information. Returned when face_fields includes qualities
++occlusion	object	Yes	Probability that each part of the face is occluded, [0, 1] (coming soon)
+++left_eye	double	Yes	Left eye
+++right_eye	double	Yes	Right eye
+++nose	double	Yes	Nose
+++mouth	double	Yes	Mouth
+++left_cheek	double	Yes	Left cheek
+++right_cheek	double	Yes	Right cheek
+++chin	double	Yes	Chin
++blur	double	Yes	Face blur degree, [0, 1]; 0 means clear, 1 means blurry (coming soon)
++illumination	-	Yes	Light level of the face region, in the range [0, 255] (coming soon)
++completeness	-	Yes	Face completeness, [0, 1]; 0 means complete, 1 means incomplete (coming soon)
++type	object	Yes	Real face/cartoon face confidence
+++human	-	Yes	Real face confidence, [0, 1]
+++cartoon	-	Yes	Cartoon face confidence, [0, 1]

Return Example

{
    "result_num": 1,
    "result": [
        {
            "location": {
                "left": 90,
                "top": 92,
                "width": 111,
                "height": 99
            },
            "face_probability": 1,
            "rotation_angle": 6,
            "yaw": 11.61234664917,
            "pitch": -0.30852827429771,
            "roll": 8.8044967651367,
            "landmark": [
                {
                    "x": 105,
                    "y": 110
                },
              ...
            ],
            "landmark72": [
                {
                    "x": 88,
                    "y": 109
                },
               ...
            ],
            "gender": "male",
            "gender_probability": 0.99358034133911,
            "glasses": 0,
            "glasses_probability": 0.99991309642792,
            "race": "yellow",
            "race_probability": 0.99960690736771,
            "qualities": {
                "occlusion": {
                    "left_eye": 0.000085282314103097,
                    "right_eye": 0.00001094374601962,
                    "nose": 3.2677664307812e-7,
                    "mouth": 2.6582130940866e-10,
                    "left_cheek": 8.752236624332e-8,
                    "right_cheek": 1.0212766454742e-7,
                    "chin": 4.2632994357028e-10
                },
                "blur": 4.5613666312237e-41,
                "illumination": 0,
                "completeness": 0,
                "type": {
                    "human": 0.98398965597153,
                    "cartoon": 0.016010366380215
                }
            }
        }
    ],
    "log_id": 2418894422
}
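
The `location` object gives each face as a left/top offset plus a width and height. For cropping or drawing, it is often handier as corner coordinates. The Python sketch below converts it and skips low-confidence detections. The function name `face_boxes` and the default `min_probability` of 0.5 are hypothetical choices, not values defined by the service.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def face_boxes(face_result: dict, min_probability: float = 0.5) -> List[Box]:
    """Convert each confident face location to corner coordinates."""
    boxes = []
    for face in face_result.get("result", []):
        if face["face_probability"] < min_probability:
            continue  # treat low-confidence detections as noise
        loc = face["location"]
        boxes.append(
            (loc["left"], loc["top"], loc["left"] + loc["width"], loc["top"] + loc["height"])
        )
    return boxes
```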

Antiporn

Feature

Pornography recognition.

Request Parameter

None

Return Parameter

Field	Type	Required or not	Description
confidence_coefficient	string	Yes	Whether the result is certain; one of "determined" or "undetermined".
result_num	uint32	Yes	Number of results returned, that is, the number of elements in the result array.
result	array(array(double))	Yes	Result array; each element corresponds to the result for one classification dimension.
conclusion	string	Yes	Final recognition result for the image; one of "pornographic", "sexy", or "normal".
log_id	uint64	Yes	Request identifier; a unique random number.

Each element of result contains the following fields:

Field	Type	Required or not	Description	Example
class_name	string	Yes	Classification result name	Pornographic
probability	double	Yes	Classification result confidence	0.89471650123596

Return Example

{
    "result": [
        {
            "probability": 0.000301,
            "class_name": "Pornographic"
        },
        {
            "probability": 0.000054,
            "class_name": "Sexy"
        },
        {
            "probability": 0.999645,
            "class_name": "Normal"
        }
    ],
    "conclusion": "Normal",
    "log_id": 848999404,
    "confidence_coefficient": "Determined",
    "result_num": 3
}
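
The antiporn result can be reduced to the most likely class and a flag for uncertain cases. The Python sketch below shows one way. The function names are hypothetical, and sending anything not marked "determined" to a human reviewer is an assumption about how a caller might use `confidence_coefficient`, not documented service behavior.

```python
from typing import Tuple


def top_class(antiporn_result: dict) -> Tuple[str, float]:
    """Return the class name with the highest probability, and that probability."""
    best = max(antiporn_result["result"], key=lambda item: item["probability"])
    return best["class_name"].strip(), best["probability"]


def needs_manual_review(antiporn_result: dict) -> bool:
    """Flag results the service did not mark as determined."""
    # Compared case-insensitively because the table and the example differ in case.
    value = antiporn_result.get("confidence_coefficient", "")
    return value.strip().lower() != "determined"
```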

Politician

Feature

Political figure recognition.

Request Parameter

None

Return Parameter

Parameter	Subparameter	Subparameter	Type	Required	Description
log_id	-	-	uint64	Yes	Log id
result_num	-	-	uint32	Yes	Number of faces actually detected (not greater than max_face_num)
result	-	-	object[]	Yes
-	location	-	object	Yes	Face position in the input image
-	-	left	uint32	Yes	Distance of face region from left border
-	-	top	uint32	Yes	Distance of face region from upper border
-	-	width	uint32	Yes	Face region width
-	-	height	uint32	Yes	Face region height
-	stars	-	object[]	Yes	Public figure array
-	-	name	string	Yes	Name
-	-	star_id	string	Yes	Character id, globally unique
-	-	probability	float	Yes	Similarity, [0, 1]

Return Example

{ 
    "log_id": 3268660173,
    "result_num": 1,
    "result": [ 
        { 
            "location": { 
                "left": 132, 
                "top": 168, 
                "width": 238, 
                "height": 223 
            }, 
            "stars": [ 
                { 
                    "name": "Xi Jinping", 
                    "star_id": "515617",
                    "probability": 0.9750030040741 
                } 
            ] 
        } 
    ] 
}

Terror

Feature

Violent-terrorist recognition

Request Parameter

None

Return Parameter

Field	Type	Required or not	Description
result	array(array(double))	Yes	Violent-terrorist confidence score.
log_id	uint64	Yes	Request identifier; a unique random number.
result_coarse	object[]	Yes	Coarse-grained score results
+name	string	Yes	Coarse-grained tag, one of two: Normal, Violent-terrorist
+score	float	Yes	Confidence score of the corresponding tag; the larger the score, the higher the confidence
result_fine	object[]	Yes	Fine-grained score results
+name	string	Yes	Fine-grained tag, one of nine: Normal, Police Force, Bloodiness, Dead Body, Explosion Fire, Homicide, Riot, Violent-terrorist figure, Military Weapon
+score	float	Yes	Confidence score of the corresponding tag; the larger the score, the higher the confidence

Return Example

{ 
    "errno": 0, 
    "msg": "success", 
    "data": { 
        "result": 0.0082325544208288, 
        "result_coarse": [ 
            { 
                "name": "Normal", 
                "score": 0.99176746606827 
            }, 
            {
                "name": "Violent-terrorist",
                "score": 0.0082325544208288
            }
        ], 
        "result_fine": [ 
            { 
                "name": "Normal", 
                "score": 0.98908758163452 
            }, 
            { 
                "name": "Police Force", 
                "score": 0.0062405453063548 
            }, 
            { 
                "name": "Bloodiness", 
                "score": 0.0009653537417762 
            }, 
            { 
                "name": "Dead Body", 
                "score": 0.001054480439052 
            }, 
            { 
                "name": "Explosion Fire", 
                "score": 0.00011743687355192 
            }, 
            { 
                "name": "Homicide", 
                "score": 0.0011699661845341 
            }, 
            { 
                "name": "Riot", 
                "score": 0.000021190358893364 
            }, 
            {
                "name": "Violent-terrorist figure",
                "score": 0.0010401027975604
            },
            { 
                "name": "Military Weapon", 
                "score": 0.00030337597127073 
            } 
        ] 
    } 
}
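
When a photo is flagged, the fine-grained tags indicate which category contributed most. The Python sketch below picks the highest-scoring tag other than Normal. It takes the `data` object from the example above as input; the function name `top_fine_tag` is hypothetical.

```python
from typing import Optional, Tuple


def top_fine_tag(data: dict) -> Optional[Tuple[str, float]]:
    """Return the highest-scoring fine-grained tag other than Normal, or None."""
    flagged = [
        tag for tag in data.get("result_fine", [])
        if tag["name"].strip() != "Normal"
    ]
    if not flagged:
        return None
    best = max(flagged, key=lambda tag: tag["score"])
    return best["name"].strip(), best["score"]
```

The tag alone does not say whether to reject the photo. A caller would still compare the score against a threshold of its own choosing.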

Public

Feature

Public figure recognition

Request Parameter

Parameter	Required	Type	Description
max_face_num	No	uint32	Maximum number of faces to process; the default is 1 and the maximum is 5
max_star_num	No	uint32	Maximum number of similar public figures returned for a single face; the default is 4

Return Parameter

Parameter	Subparameter	Subparameter	Type	Required	Description
log_id	-	-	uint64	Yes	Log id
result_num	-	-	uint32	Yes	Number of faces actually detected (not greater than max_face_num)
result	-	-	object[]	Yes
-	location	-	object	Yes	Face position in the input image
-	-	left	uint32	Yes	Distance of face region from left border
-	-	top	uint32	Yes	Distance of face region from upper border
-	-	width	uint32	Yes	Face region width
-	-	height	uint32	Yes	Face region height
-	stars	-	object[]	Yes	Public figure array
-	-	name	string	Yes	Name
-	-	star_id	string	Yes	Character id, globally unique
-	-	probability	float	Yes	Similarity, [0, 1]

Return Example

{ 
    "log_id": 3268660173,
    "result_num": 1,
    "result": [ 
        { 
            "location": { 
                "left": 132, 
                "top": 168, 
                "width": 238, 
                "height": 223 
            }, 
            "stars": [ 
                { 
                    "name": "Xi Jinping", 
                    "star_id": "515617",
                    "probability": 0.9750030040741 
                } 
            ] 
        } 
    ] 
}
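
Each detected face carries a `stars` array of candidate matches with a similarity score. The Python sketch below collects the names whose similarity meets a threshold. The same shape is returned by the politician sub-service, so it would apply there as well. The function name `matched_names` and the default threshold of 0.9 are hypothetical.

```python
from typing import List


def matched_names(figure_result: dict, min_probability: float = 0.9) -> List[str]:
    """Collect names of candidates whose similarity meets the threshold."""
    names = []
    for face in figure_result.get("result", []):
        for star in face.get("stars", []):
            if star["probability"] >= min_probability:
                names.append(star["name"])
    return names
```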

Disgust

Feature

Disgusting photo recognition

Request Parameter

None

Return Parameter

Parameter	Type	Required	Description
log_id	uint64	Yes	Request identifier; a unique random number.
result	double	Yes	Score

Return Example

{ 
	 "result": 9.2708455667889E-7, 
	"log_id": 2977989308
}

Watermark

Feature

Watermark detection

Request Parameter

None

Return Value

Parameter	Type	Required	Description
log_id	uint64	Yes	Request identifier; a unique random number.
result_num	uint32	No	Number of results returned, that is, the number of elements in the result array
result	array(object)	No	Result array; each item is one detection

Each element of result contains the following fields:

Parameter	Type	Required	Description
location	object	No	Position information (pixel position from left, pixel position from top, pixel width, pixel height)
probability	double	Yes	Classification result confidence [0-1.0]
type	string	Yes	Return result type (watermark, bar code, QR code)

Return Example

{ 
	 "result": [{ 
	 	 "probability": 0.99872654676437, 
	 	 "type": "watermark" 
	 }, 
	 { 
	 	 "probability": 0.98578763008118, 
	 	 "type": "watermark" 
	 }], 
	"log_id": 686882979,
	"result_num": 2
}
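
A result can contain several detections of different types. The Python sketch below tallies them by `type`, counting only detections above a confidence threshold. The function name `count_detections` and the default threshold of 0.9 are hypothetical.

```python
from collections import Counter


def count_detections(watermark_result: dict, min_probability: float = 0.9) -> Counter:
    """Count confident detections per type (watermark, bar code, QR code)."""
    counts = Counter()
    for item in watermark_result.get("result", []):
        if item["probability"] >= min_probability:
            counts[item["type"]] += 1
    return counts
```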

Quality

Feature

Photo quality recognition

Request Parameter

None

Return Parameter

Parameter	Subparameter	Type	Required	Description
log_id	-	uint64	Yes	Log id
result	-	object[]	Yes
-	aesthetic	double	Yes	Aesthetics
-	clarity	double	Yes	Sharpness

Return Example

{ 
	 "result": { 
	 	 "aesthetic": 0.26410301526388, 
	 	 "clarity": 0.28039423624674 
	 }, 
	"log_id": 93316007
}
