Photo Review
Overview
The photo review service is an intelligent photo review service provided by Baidu AI Cloud. It reviews photos along multiple dimensions, including pornographic recognition, violent-terrorist recognition, politically sensitive recognition, disgusting photo recognition, and advertising recognition. It applies to a wide range of scenarios, including games, social networking, forums, life services, and UGC. For example, a game developer can use the photo review service to review avatars uploaded by players.
The photo review service is deeply integrated with BOS, so you can call the photo review capability through the BOS API, SDK, and CLI.
Charging Mode
BOS is the call entry for the photo review service and charges only the BOS API request fee. For details, please refer to BOS Pricing.
The photo review itself is charged through the photo review product.
Subscribing to the Photo Review Service
At present, Baidu AI Cloud supports the review of photos stored on BOS. You can upload an object to BOS through the Console, SDK, or CLI.
1. To subscribe to the photo review service, choose "photo review" in the "data processing" module on the global overview page and click "use right now".
After subscribing, you can call the photo review function through the BOS CLI tools or the BOS API. The API is described below.
API Interface
This interface is used to request the photo review service.
Note: You can use this interface only if you have subscribed to the photo review service in the console.
Request
- Request syntax
POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
  "action" : {
    "sync" : [{
        "url" : "$(img-censor)",
        "parameters" : "<base64_encode(param)>"
      }
    ]
  }
}
- Request parameter
  - url The fixed value is $(img-censor), which does not need to be modified.
  - parameters The base64-encoded photo review functional parameter. The functional parameter itself is a JSON string.
Basic structure of the photo review functional parameter:
{ "antiporn" : {}, "terror" : {}, "ocr" : { "detect_direction" : "false", "language_type" : "CHN_ENG", "recognize_granularity" : "big" } }
The parameter can combine any number of sub-services as needed, or contain a single sub-service such as pornographic recognition, in which case it is written as
{ "antiporn" : {} }
The photo review service supports the following sub-services:
- ocr Universal text recognition
- face Face detection
- antiporn Pornographic recognition
- politician Political figure recognition
- terror Violent-terrorist recognition
- public Public figure recognition
- disgust Disgusting photo recognition
- watermark Watermark and QR code recognition
- quality Photo quality recognition
The value of each key is the parameter of the underlying service; for example, see the input parameters of the Hetu OCR Universal Text Recognition Service.
For detailed explanations of the request and return parameters of each sub-service, see the sections below.
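The encoding step described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper name `encode_params` is hypothetical, and the compact JSON separators are a choice of this sketch (they happen to reproduce the encoded string used in the request example of this document), not a documented requirement.

```python
import base64
import json


def encode_params(sub_services: dict) -> str:
    """Serialize the functional parameter to JSON and base64-encode it."""
    # Compact separators are an assumption of this sketch.
    raw = json.dumps(sub_services, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


# Single sub-service: pornographic recognition only.
single = encode_params({"antiporn": {}})

# Combination of sub-services, with OCR options passed through unchanged.
combined = {
    "antiporn": {},
    "terror": {},
    "ocr": {
        "detect_direction": "false",
        "language_type": "CHN_ENG",
        "recognize_granularity": "big",
    },
}
print(single)
print(encode_params(combined))
```

The encoded string is what goes into the parameters field of the request body.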
- Request header None
Response
- Response header fields None
- Response element See the return parameter explanation of each sub-service.
Example
- Request example
{ "antiporn": {}, "ocr": { "detect_direction": "false", "language_type": "CHN_ENG", "recognize_granularity": "big" } }
After base64 encoding, this becomes:
eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ==
With this value filled into parameters, the request is sent as follows:
POST <ObjectName>?process HTTP/1.1
Host: <BucketName>.bj.bcebos.com
Date: <Date>
Authorization: <AuthorizationString>
Content-Type: application/json; charset=utf-8
Content-Length: <ContentLength>

{
  "action" : {
    "sync" : [{
        "url" : "$(img-censor)",
        "parameters" : "eyJhbnRpcG9ybiI6e30sIm9jciI6eyJkZXRlY3RfZGlyZWN0aW9uIjoiZmFsc2UiLCJsYW5ndWFnZV90eXBlIjoiQ0hOX0VORyIsInJlY29nbml6ZV9ncmFudWxhcml0eSI6ImJpZyJ9fQ=="
      }
    ]
  }
}
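As a rough sketch of how such a request could be assembled programmatically, the Python below builds the path, headers, and JSON body. The function name `build_process_request`, the bucket and object names, and the leading slash in the path are assumptions for illustration. The Date and Authorization headers are deliberately left out, because request signing is not covered in this section.

```python
import base64
import json

IMG_CENSOR_URL = "$(img-censor)"  # fixed value from the request parameter list


def build_process_request(bucket: str, object_name: str, params: dict) -> dict:
    """Assemble an unsigned POST <ObjectName>?process request description."""
    encoded = base64.b64encode(
        json.dumps(params, separators=(",", ":")).encode("utf-8")
    ).decode("ascii")
    body = json.dumps(
        {"action": {"sync": [{"url": IMG_CENSOR_URL, "parameters": encoded}]}}
    )
    return {
        "method": "POST",
        "path": f"/{object_name}?process",  # leading slash is an assumption
        "headers": {
            "Host": f"{bucket}.bj.bcebos.com",
            "Content-Type": "application/json; charset=utf-8",
            "Content-Length": str(len(body.encode("utf-8"))),
            # Date and Authorization must be added by the signing step.
        },
        "body": body,
    }


request = build_process_request(
    "my-bucket",   # hypothetical bucket name
    "photo.jpg",   # hypothetical object name
    {
        "antiporn": {},
        "ocr": {
            "detect_direction": "false",
            "language_type": "CHN_ENG",
            "recognize_granularity": "big",
        },
    },
)
print(request["body"])
```

In practice the BOS SDK or CLI would normally handle signing and sending; this sketch only shows how the body is put together.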
- Successful response example
HTTP/1.1 200 OK
Date: Thu, 22 Jun 2017 07:30:56 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 237
Connection: keep-alive
Server: BceBos
x-bce-debug-id: MTAuNzUuNzguNDA6VGh1LCAyMiBKdW4gMjAxNyAxNTozMDo1NiBDU1Q6MTg1MDY5NDg3OQ==
x-bce-request-id: 598f7e18-77fb-424a-bc68-95acb0644076

{
  "result" : {
    "antiporn" : {
      "result" : [{
          "probability" : 0.000071,
          "class_name" : "Pornographic"
        }, {
          "probability" : 0.000291,
          "class_name" : "Sexy"
        }, {
          "probability" : 0.999638,
          "class_name" : "Normal"
        }
      ],
      "log_id" : 1853911322,
      "result_num" : 3
    },
    "ocr" : {
      "log_id" : 2471272194,
      "words_result_num" : 2,
      "words_result" : [{
          "words" : " TSINGTAO"
        }, {
          "words" : "Qingdao"
        }
      ]
    }
  },
  "log_id" : 149811665151162
}
- Failure return example
{ "log_id": 149319909347709, "error_code": 216500, "error_msg": "unknown error" }
Explanation of failure error_code values:
Error code | Error message | Description |
---|---|---|
216101 | not enough param | Insufficient parameters |
216102 | service not support | An unsupported underlying service type was entered |
216200 | empty photo | No photo URL |
216500 | unknown error | Unknown error |
282804 | download photo error | Photo download error |
282000 | logic internal error | Internal logic error |
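One way to act on these error codes is sketched below in Python. The exception class `PhotoReviewError` and the helper `raise_for_error` are hypothetical names; the sketch assumes that a failed call returns a JSON object containing `error_code`, as in the failure example, and that a successful one does not.

```python
# Descriptions taken from the error code list above.
ERROR_DESCRIPTIONS = {
    216101: "Insufficient parameters",
    216102: "An unsupported underlying service type was entered",
    216200: "No photo URL",
    216500: "Unknown error",
    282804: "Photo download error",
    282000: "Internal logic error",
}


class PhotoReviewError(Exception):
    """Raised when a response payload carries an error_code (hypothetical)."""

    def __init__(self, code, message, log_id=None):
        self.code = code
        self.log_id = log_id
        self.description = ERROR_DESCRIPTIONS.get(code, "Undocumented error code")
        super().__init__(f"{code} {message}: {self.description} (log_id={log_id})")


def raise_for_error(payload: dict) -> dict:
    """Return the payload unchanged, or raise if it describes a failure."""
    code = payload.get("error_code")
    if code is None:
        return payload
    raise PhotoReviewError(code, payload.get("error_msg", ""), payload.get("log_id"))


failure = {"log_id": 149319909347709, "error_code": 216500, "error_msg": "unknown error"}
try:
    raise_for_error(failure)
except PhotoReviewError as exc:
    print(exc)
```

Keeping the log_id in the exception makes it easier to quote when reporting a problem.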
Ocr
Feature
Recognizes all text in a photo.
Request Parameter
Parameter | Is it required | Type | Optional value range | Description |
---|---|---|---|---|
language_type | No | string | CHN_ENG, ENG, POR, FRE, GER, ITA, SPA, RUS, JAP, KOR | Recognition language type; CHN_ENG by default. Values: CHN_ENG: mixed Chinese and English; ENG: English; POR: Portuguese; FRE: French; GER: German; ITA: Italian; SPA: Spanish; RUS: Russian; JAP: Japanese; KOR: Korean |
detect_direction | No | boolean | true, false | Whether to detect the image orientation; false (not detected) by default. Orientation refers to whether the input image is upright or rotated 90/180/270 degrees counterclockwise. true: detect orientation; false: do not detect orientation |
detect_language | No | string | true, false | Whether to detect the language; not detected by default. Chinese, English, Japanese, and Korean are currently supported |
probability | No | string | true, false | Whether to return the confidence of each row in the recognition result |
Return parameter
Field | Is it required | Type | Description |
---|---|---|---|
direction | No | int32 | Image orientation; present when detect_direction = true. -1: undefined; 0: upright; 1: 90 degrees counterclockwise; 2: 180 degrees counterclockwise; 3: 270 degrees counterclockwise |
log_id | Yes | uint64 | Unique log ID, used for troubleshooting |
words_result | Yes | array() | Recognition result array |
words_result_num | Yes | uint32 | Number of recognition results, that is, the number of elements in words_result |
+words | No | string | Recognition result string |
probability | No | object | Confidence of each row in the recognition result, containing average (row confidence average), variance (row confidence variance), and min (row confidence minimum) |
Return Example
{
"log_id": 2471272194,
"words_result_num": 2,
"words_result":
[
{"words": " TSINGTAO"},
{"words": "Tsingtao Beer"}
]
}
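A short Python sketch of consuming this result follows. The helper names are hypothetical, and the direction mapping assumes the codes are -1 (undefined) and 0 to 3 for rotations of 0, 90, 180, and 270 degrees counterclockwise, as read from the return parameter table.

```python
# Counterclockwise rotation in degrees for each direction code; -1 is undefined.
DIRECTION_DEGREES = {0: 0, 1: 90, 2: 180, 3: 270}


def extract_lines(ocr_result: dict) -> list:
    """Return recognized text lines with surrounding whitespace removed."""
    return [item["words"].strip() for item in ocr_result.get("words_result", [])]


def rotation_degrees(ocr_result: dict):
    """Return the rotation in degrees, or None when absent or undefined."""
    return DIRECTION_DEGREES.get(ocr_result.get("direction", -1))


sample = {
    "log_id": 2471272194,
    "words_result_num": 2,
    "words_result": [{"words": " TSINGTAO"}, {"words": "Tsingtao Beer"}],
}
print(extract_lines(sample))
print(rotation_degrees(sample))
```

Stripping whitespace is a convenience of this sketch; the raw strings may carry leading spaces, as the first line of the example does.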
Face
Feature
- Detects faces in the requested photo and returns the face position, the coordinates of 72 key points, and face attribute information.
- Detection response time depends on the number of faces in the photo and is slightly longer when there are many faces.
- Typical application scenarios include face attribute analysis, processing and analysis based on facial key points, and face-based marketing campaigns.
- The positions of the facial features are marked with specific coordinates. The 72 key points also have specific coordinates, but no detailed description of the position each one corresponds to.
Request Parameter
None
Return Parameter
Parameter | Type | Required or not | Description |
---|---|---|---|
log_id | uint64 | Yes | Log id |
result_num | uint32 | Yes | Number of faces |
result | object[] | Yes | Collection of face attribute objects |
+age | double | No | Age. Returned when face_fields includes age |
+beauty | double | No | Attractiveness score in the range 0-100; a higher score means more attractive. Returned when face_fields includes beauty |
+location | object | Yes | Face position in the image |
++left | uint32 | Yes | Distance of face region from left border |
++top | uint32 | Yes | Distance of face region from upper border |
++width | uint32 | Yes | Face region width |
++height | uint32 | Yes | Face region height |
+face_probability | double | Yes | Face confidence, in the range 0 to 1 |
+rotation_angle | int32 | Yes | Clockwise rotation angle of face frame relative to vertical direction, [-180,180] |
+yaw | double | Yes | Left and right rotation angle of 3D rotation [-90 (left), 90 (right)] |
+pitch | double | Yes | Pitch angle of 3D rotation [-90 (up), 90 (down)] |
+roll | double | Yes | Rotation angle in the plane [-180 (counterclockwise), 180 (clockwise)] |
+expression | uint32 | No | Facial expression: 0 (no smile), 1 (smile), 2 (laugh). Returned when face_fields includes expression |
+expression_probability | double | No | Expression confidence, in the range 0 to 1. Returned when face_fields includes expression |
+faceshape | object[] | No | Face shape confidence. Returned when face_fields includes faceshape |
++type | string | Yes | Face type: square/triangle/oval/heart/round |
++probability | double | Yes | Confidence: 0~1 |
+gender | string | No | male or female. Returned when face_fields includes gender |
+gender_probability | double | No | Gender confidence, in the range 0 to 1. Returned when face_fields includes gender |
+glasses | uint32 | No | Whether glasses are worn: 0 (no glasses), 1 (ordinary glasses), 2 (sunglasses). Returned when face_fields includes glasses |
+glasses_probability | double | No | Glasses confidence, in the range 0 to 1. Returned when face_fields includes glasses |
+landmark | object[] | No | Positions of 4 key points: left eye center, right eye center, nose tip, and mouth center. Returned when face_fields includes landmark |
++x | uint32 | No | x coordinate |
++y | uint32 | No | y coordinate |
+landmark72 | object[] | No | Positions of 72 feature points. Returned when face_fields includes landmark |
++x | uint32 | No | x coordinate |
++y | uint32 | No | y coordinate |
+race | string | No | yellow, white, black, or arabs. Returned when face_fields includes race |
+race_probability | double | No | Race confidence, in the range 0 to 1. Returned when face_fields includes race |
+qualities | object | No | Face quality information. Returned when face_fields includes qualities |
++occlusion | object | Yes | Probability that each part of the face is occluded, [0, 1] (not yet available) |
+++left_eye | double | Yes | Left eye |
+++right_eye | double | Yes | Right eye |
+++nose | double | Yes | Nose |
+++mouth | double | Yes | Mouth |
+++left_cheek | double | Yes | Left cheek |
+++right_cheek | double | Yes | Right cheek |
+++chin | double | Yes | Chin |
++blur | double | Yes | Face blur degree, [0, 1]. 0 means sharp, 1 means blurred (not yet available) |
++illumination | - | Yes | Light level of the face region, in the range [0, 255] (not yet available) |
++completeness | - | Yes | Face completeness, [0, 1]. 0 means complete, 1 means incomplete (not yet available) |
++type | object | Yes | Real face/cartoon face confidence |
+++human | - | Yes | Real face confidence, [0, 1] |
+++cartoon | - | Yes | Cartoon face confidence, [0, 1] |
Return Example
{
"result_num": 1,
"result": [
{
"location": {
"left": 90,
"top": 92,
"width": 111,
"height": 99
},
"face_probability": 1,
"rotation_angle": 6,
"yaw": 11.61234664917,
"pitch": -0.30852827429771,
"roll": 8.8044967651367,
"landmark": [
{
"x": 105,
"y": 110
},
...
],
"landmark72": [
{
"x": 88,
"y": 109
},
...
],
"gender": "male",
"gender_probability": 0.99358034133911,
"glasses": 0,
"glasses_probability": 0.99991309642792,
"race": "yellow",
"race_probability": 0.99960690736771,
"qualities": {
"occlusion": {
"left_eye": 0.000085282314103097,
"right_eye": 0.00001094374601962,
"nose": 3.2677664307812e-7,
"mouth": 2.6582130940866e-10,
"left_cheek": 8.752236624332e-8,
"right_cheek": 1.0212766454742e-7,
"chin": 4.2632994357028e-10
},
"blur": 4.5613666312237e-41,
"illumination": 0,
"completeness": 0,
"type": {
"human": 0.98398965597153,
"cartoon": 0.016010366380215
}
}
}
],
"log_id": 2418894422
}
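As an illustration of working with the location fields, the Python sketch below converts each detected face into a (left, top, right, bottom) box and drops low-confidence detections. The function name `face_boxes` and the 0.5 default threshold are assumptions of this sketch, not values recommended by the service.

```python
def face_boxes(face_result: dict, min_probability: float = 0.5) -> list:
    """Return (left, top, right, bottom) boxes for sufficiently confident faces."""
    boxes = []
    for face in face_result.get("result", []):
        if face["face_probability"] < min_probability:
            continue  # skip detections below the chosen threshold
        loc = face["location"]
        boxes.append(
            (
                loc["left"],
                loc["top"],
                loc["left"] + loc["width"],
                loc["top"] + loc["height"],
            )
        )
    return boxes


# Trimmed-down payload: the first face uses the location from the example above,
# the second is a made-up low-confidence detection.
sample = {
    "result_num": 2,
    "result": [
        {"location": {"left": 90, "top": 92, "width": 111, "height": 99},
         "face_probability": 1},
        {"location": {"left": 5, "top": 5, "width": 10, "height": 10},
         "face_probability": 0.3},
    ],
    "log_id": 2418894422,
}
print(face_boxes(sample))
```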
Antiporn
Feature
Pornographic recognition.
Request Parameter
None
Return Parameter
Field | Type | Required or not | Description |
---|---|---|---|
confidence_coefficient | string | Yes | Whether the result is certain; either "determined" or "undetermined". |
result_num | uint32 | Yes | Number of results returned, that is, the number of elements in the result array. |
result | array(array(double)) | Yes | Result array; each element corresponds to the result of one classification dimension. |
conclusion | string | Yes | Final recognition result for the image; one of "pornographic", "sexy", or "normal". |
log_id | uint64 | Yes | Unique request ID (a random number). |
Each element of result contains the following fields:
Field | Type | Required or not | Description | Example |
---|---|---|---|---|
class_name | string | Yes | Classification result name | Pornographic |
probability | double | Yes | Classification result confidence | 0.89471650123596 |
Return Example
{
"result": [{
"probability": 0.000301,
"class_name": "Pornographic"
},
{
"probability": 0.000054,
"class_name": "Sexy"
},
{
"probability": 0.999645,
"class_name": "Normal"
}],
"conclusion": "Normal",
"log_id": 848999404,
"confidence_coefficient": "Determined",
"result_num": 3
}
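A possible way to reduce this result to a single decision is sketched below in Python. The function name `summarize_antiporn` is hypothetical, and treating a missing or non-"determined" confidence_coefficient as a case for manual review is a design choice of this sketch rather than documented behavior.

```python
def summarize_antiporn(antiporn_result: dict) -> dict:
    """Pick the highest-probability class and flag uncertain results."""
    best = max(antiporn_result["result"], key=lambda item: item["probability"])
    # Case-insensitive comparison: the table says "determined",
    # while the example returns "Determined".
    certain = antiporn_result.get("confidence_coefficient", "").lower() == "determined"
    return {
        "top_class": best["class_name"].strip(),
        "probability": best["probability"],
        "needs_manual_review": not certain,
    }


sample = {
    "result": [
        {"probability": 0.000301, "class_name": "Pornographic"},
        {"probability": 0.000054, "class_name": "Sexy"},
        {"probability": 0.999645, "class_name": "Normal"},
    ],
    "conclusion": "Normal",
    "log_id": 848999404,
    "confidence_coefficient": "Determined",
    "result_num": 3,
}
print(summarize_antiporn(sample))
```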
Politician
Feature
Political figure recognition.
Request Parameter
None
Return Parameter
Parameter | Subparameter | Subparameter | Type | Required or not | Description |
---|---|---|---|---|---|
log_id | - | - | uint64 | Yes | Log id |
result_num | - | - | uint32 | Yes | Number of faces actually detected (not greater than max_face_num) |
result | - | - | object[] | Yes | |
- | location | - | object | Yes | Face position in the input image |
- | - | left | uint32 | Yes | Distance of face region from left border |
- | - | top | uint32 | Yes | Distance of face region from upper border |
- | - | width | uint32 | Yes | Face region width |
- | - | height | uint32 | Yes | Face region height |
- | stars | - | object[] | Yes | Public figure array |
- | - | name | string | Yes | Name |
- | - | star_id | string | Yes | Character id, globally unique |
- | - | probability | float | Yes | Similarity, [0, 1] |
Return Example
{
"log_id": 3268660173,
"result_num": 1,
"result": [
{
"location": {
"left": 132,
"top": 168,
"width": 238,
"height": 223
},
"stars": [
{
"name": "Xi Jinping",
"star_id": "515617",
"probability": 0.9750030040741
}
]
}
]
}
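To show how the nested result and stars arrays might be traversed, here is a minimal Python sketch that lists recognized figures above a similarity threshold. The function name `recognized_figures`, the 0.8 threshold, and the sample values are all hypothetical.

```python
def recognized_figures(result: dict, min_probability: float = 0.8) -> list:
    """Return (name, star_id, probability) for matches above the threshold."""
    matches = []
    for face in result.get("result", []):
        for star in face.get("stars", []):
            if star["probability"] >= min_probability:
                matches.append((star["name"], star["star_id"], star["probability"]))
    return matches


# Made-up payload following the structure of the return parameter table.
sample = {
    "log_id": 1,
    "result_num": 1,
    "result": [
        {
            "location": {"left": 132, "top": 168, "width": 238, "height": 223},
            "stars": [
                {"name": "Example Name", "star_id": "000001", "probability": 0.97},
                {"name": "Another Name", "star_id": "000002", "probability": 0.12},
            ],
        }
    ],
}
print(recognized_figures(sample))
```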
Terror
Feature
Violent-terrorist recognition
Request Parameter
None
Return Parameter
Field | Type | Required or not | Description |
---|---|---|---|
result | array(array(double)) | Yes | Violent-terrorist confidence score. |
log_id | uint64 | Yes | Unique request ID (a random number). |
result_coarse | object[] | Yes | Coarse-grained score result |
+name | string | Yes | Coarse-grained tag, one of two: Normal, Violent-terrorist |
+score | float | Yes | Confidence score of the corresponding tag; the larger the score, the higher the confidence |
result_fine | object[] | Yes | Fine-grained score result |
+name | string | Yes | Fine-grained tag, one of 9: Normal, police force, bloodiness, dead body, explosion fire, homicide, riot, violent-terrorist figure, military weapon |
+score | float | Yes | Confidence score of the corresponding tag; the larger the score, the higher the confidence |
Return Example
{
"errno": 0,
"msg": "success",
"data": {
"result": 0.0082325544208288,
"result_coarse": [
{
"name": "Normal",
"score": 0.99176746606827
},
{
"name": "Violent-terrorist",
"score": 0.0082325544208288
}
],
"result_fine": [
{
"name": "Normal",
"score": 0.98908758163452
},
{
"name": "Police Force",
"score": 0.0062405453063548
},
{
"name": "Bloodiness",
"score": 0.0009653537417762
},
{
"name": "Dead Body",
"score": 0.001054480439052
},
{
"name": "Explosion Fire",
"score": 0.00011743687355192
},
{
"name": "Homicide",
"score": 0.0011699661845341
},
{
"name": "Riot",
"score": 0.000021190358893364
},
{
"name": "Violent-terrorist figure",
"score": 0.0010401027975604
},
{
"name": "Military Weapon",
"score": 0.00030337597127073
}
]
}
}
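The fine-grained scores can be reduced to the single most likely non-normal tag, as in the Python sketch below. The function name `top_terror_tag` is hypothetical. Because the example wraps the fields in a data object while the parameter table lists them at the top level, the sketch accepts both shapes; which one the service actually returns is not confirmed here.

```python
def top_terror_tag(payload: dict) -> tuple:
    """Return (name, score) of the highest-scoring fine-grained tag except Normal."""
    data = payload.get("data", payload)  # tolerate both wrapped and flat shapes
    flagged = [
        tag for tag in data["result_fine"]
        if tag["name"].strip().lower() != "normal"
    ]
    best = max(flagged, key=lambda tag: tag["score"])
    return best["name"].strip(), best["score"]


# Shortened version of the example above.
sample = {
    "errno": 0,
    "msg": "success",
    "data": {
        "result": 0.0082325544208288,
        "result_fine": [
            {"name": "Normal", "score": 0.98908758163452},
            {"name": "Police Force", "score": 0.0062405453063548},
            {"name": "Homicide", "score": 0.0011699661845341},
        ],
    },
}
print(top_terror_tag(sample))
```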
Public
Feature
Public figure recognition
Request Parameter
Parameter | Is it required | Type | Description |
---|---|---|---|
max_face_num | No | uint32 | Maximum number of faces to process; default 1, maximum 5 |
max_star_num | No | uint32 | Maximum number of similar public figures returned for a single face; default 4 |
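A small Python sketch of building the functional parameter for this sub-service follows. The helper name `public_params` is hypothetical, and two points are assumptions: that the values are passed as JSON integers, and that 1 is the lower bound for both settings. Only the defaults and the maximum of 5 come from the table above.

```python
def public_params(max_face_num: int = 1, max_star_num: int = 4) -> dict:
    """Build the "public" entry of the functional parameter with basic checks."""
    if not 1 <= max_face_num <= 5:
        raise ValueError("max_face_num must be between 1 and 5")
    if max_star_num < 1:  # lower bound is an assumption of this sketch
        raise ValueError("max_star_num must be at least 1")
    return {"public": {"max_face_num": max_face_num, "max_star_num": max_star_num}}


print(public_params())
print(public_params(max_face_num=3, max_star_num=2))
```

The returned dictionary can be merged with other sub-service entries before JSON serialization and base64 encoding.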
Return Parameter
Parameter | Subparameter | Subparameter | Type | Required | Description |
---|---|---|---|---|---|
log_id | - | - | uint64 | Yes | Log id |
result_num | - | - | uint32 | Yes | Number of faces actually detected (not greater than max_face_num) |
result | - | - | object[] | Yes | |
- | location | - | object | Yes | Face position in the input image |
- | - | left | uint32 | Yes | Distance of face region from left border |
- | - | top | uint32 | Yes | Distance of face region from upper border |
- | - | width | uint32 | Yes | Face region width |
- | - | height | uint32 | Yes | Face region height |
- | stars | - | object[] | Yes | Public figure array |
- | - | name | string | Yes | Name |
- | - | star_id | string | Yes | Character id, globally unique |
- | - | probability | float | Yes | Similarity, [0, 1] |
Return Example
{
"log_id": 3268660173,
"result_num": 1,
"result": [
{
"location": {
"left": 132,
"top": 168,
"width": 238,
"height": 223
},
"stars": [
{
"name": "Xi Jinping",
"star_id": "515617",
"probability": 0.9750030040741
}
]
}
]
}
Disgust
Feature
Disgusting photo recognition
Request Parameter
None
Return Parameter
Parameter | Type | Required | Description |
---|---|---|---|
log_id | uint64 | Yes | Unique request ID (a random number). |
result | double | Yes | Score |
Return Example
{
"result": 9.2708455667889E-7,
"log_id": 2977989308
}
Watermark
Feature
Watermark detection
Request Parameter
None
Return Value
Parameter | Type | Required | Description |
---|---|---|---|
log_id | uint64 | Yes | Unique request ID (a random number). |
result_num | uint32 | No | Number of results returned, that is, the number of elements in the result array |
result | array(object) | No | Result array; each element is one detected item |
Each element of result contains the following fields:
Parameter | Type | Required | Description |
---|---|---|---|
location | object | No | Position information (pixel position from left, pixel position from top, pixel width, pixel height) |
probability | double | Yes | Classification result confidence [0-1.0] |
type | string | Yes | Return result type (watermark, bar code, QR code) |
Return Example
{
"result": [{
"probability": 0.99872654676437,
"type": "watermark"
},
{
"probability": 0.98578763008118,
"type": "watermark"
}],
"log_id": 686882979,
"result_num": 2
}
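For instance, detections could be tallied by type as in the Python sketch below. The function name `detections_by_type` and the 0.9 confidence threshold are assumptions made for illustration.

```python
from collections import Counter


def detections_by_type(watermark_result: dict, min_probability: float = 0.9) -> Counter:
    """Count detected items per type, ignoring low-confidence detections."""
    return Counter(
        item["type"]
        for item in watermark_result.get("result", [])
        if item["probability"] >= min_probability
    )


sample = {
    "result": [
        {"probability": 0.99872654676437, "type": "watermark"},
        {"probability": 0.98578763008118, "type": "watermark"},
    ],
    "log_id": 686882979,
    "result_num": 2,
}
print(detections_by_type(sample))
```

Since result and result_num are optional, the sketch falls back to an empty list when result is missing.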
Quality
Feature
Photo quality recognition
Request Parameter
None
Return Parameter
Parameter | Subparameter | Type | Required | Description |
---|---|---|---|---|
log_id | - | uint64 | Yes | Log id |
result | - | object[] | Yes | |
- | aesthetic | double | Yes | Aesthetics |
- | clarity | double | Yes | Sharpness |
Return Example
{
"result": {
"aesthetic": 0.26410301526388,
"clarity": 0.28039423624674
},
"log_id": 93316007
}