SelectObject
Last Updated:2021-12-08
Interface Description
This interface is used to execute the SQL statement of the specified object in the bucket, and select and return the specified content. The requester must have the read permission of the selected object. Before SelectObject, you need to ensure that the corresponding bucket and object already exist. For more information, see the SelectObject Developer Documentation.
Request
-
Request syntax
POST /<ObjectKey>?select&type=json/csv HTTP/1.1 Host: <BucketName>.bj.bcebos.com Date: <Date> Authorization: <Authorization_String> Content-Type: application/json; charset=utf-8 Content-Length: <Content_Length> { "selectRequest": { "expression": "Base64Encode(Select * from BosObject)", "expressionType": "SQL", "inputSerialization": { "compressionType": "GZIP/NONE", // JSON or CSV }, "outputSerialization": { // JSON or CSV }, "requestProgress": { "enabled": false/true } } }
-
Request parameters
Terms Type Description Required? type string The destination object type selected, which can be JSON/CSV currently. Yes -
Request body
JSON file
Name Required? Type Description selectRequest Yes - JSON body root node - expression | Yes | string | Base64 encoded SQL statement
- expressionType | Yes | string | Query statement syntax type, which can be SQL only.
- inputSerialization | Yes | - | Input stream node, whose child node describes the queried object format information. ++compressionType | No | string | Specify whether the queried object is compressed, "NONE" or "GZIP" optional. ++ json | Yes | - | JSON node, whose child node describes the JSON file related information. +++ type | Yes | string | JSON type, queriedJSON object format, "DOCUMENT" or "LINES" optional.
- outputSerialization | Yes | - | Output stream node, whose child node describes the return format information of the query result. ++ json | Yes | | JSON node, whose child node describes the returned JSON data related information. +++ recordDelimiter | No | string | Specify the newline character with Base64 encoding. The default value is \n (optinal)
- requestProgress | No | - | Select the progress information node, whose child node describes the execution progress of Select operation and returns it to the user within 3s regularly. ++ enabled | No | boolean | Describe whether the progress information needs to be returned regularly, false/true optional. If the data filtration takes a long time, it may cause the timeout error 504, and you can set it to “True” to maintain the HTTP connection.
CSV files
Name Required? Type Description selectRequest Yes - JSON body root node + expression Yes string Base64 encoded SQL statement + expressionType Yes string Query statement syntax type, which can be SQL only. + inputSerialization Yes - Input stream node, whose child node describes the queried object format information. ++ compressionType + outputSerialization Yes - Output stream node, whose child node describes the return format information of query result. ++ outputHeader + requestProgress No - Select the progress information node, whose child node describes the execution progress of Select operation and returns it to the user within 3s regularly. ++ enabled -
Request header field
No special header field.
Response
-
Response header field
Name Type Description Transfer-Encoding String The value is chunked, which indicates that the content is returned in the form of the HTTP1.1 encoded chunk. -
Response parameters
None
-
Response body
The response of the SelectObject interface is returned in the chunk form, including Records message, Continuation message, and End message.
message Format Description Records message prelude(8 byte) + n * (header_key_len(1 byte) + header_key + header_val_len(2 byte) + header_val) + payload + crc32(4 byte) It contains the data returned by the Select request, which can be one line or multiple lines of records. Continuation message The same as above, which is in a fixed format, but the payload content is different. It returns the current Select progress (number of bytes scanned/number of bytes returned) to the client every 3 seconds regularly and maintains the HTTP connection. End message The same as above, which is in a fixed format. The payload content is empty. It indicates the end of this Select request. The headers field contains error-code, error-message, message-type and bytes-scanned information. Detailed description of the message format:
- There are 8 bytes in the prelude section. The first 4 bytes represent the total length of messages, and the last 4 bytes represent the total length of headers,
total chunk length (prelude the value stored in the first 4 bytes) - Total header length - prelude the 8 bytes- 4 byte of crc32 = Total length of the payload data
. CRC32 indicates the erasure code of the whole message. - Headers contain the following custom <key,value>: “message-type”: {"Records", "Cont", "End"}, "error-code": specific error code and "error-message": "detailed error information".
- Payload represents the real data returned, which can be in any format. The payload of the Continuation message contains the Select progress information represented by BytesScanned and BytesReturned fields.
- There are 8 bytes in the prelude section. The first 4 bytes represent the total length of messages, and the last 4 bytes represent the total length of headers,
Example
-
CSV file request example
POST /object?select&type=csv HTTP/1.1 Host: bucket.bj.bcebos.com Date: Thu, 15 May 2017 00:17:23 GMT Authorization: <Authorization_String> Content-Type: application/json; charset=utf-8 Content-Length: 512 { "selectRequest": { "expression": "c2VsZWN0IGNvdW50KCopIGZyb20gbxkl2JqZWN0IHdoZXJlIF80ID4gNDU=", "expressionType": "SQL", "inputSerialization": { "compressionType": "NONE", "csv": { "fileHeaderInfo": "NONE", "recordDelimiter": "Cg==", "fieldDelimiter": "LA==", "quoteCharacter": "Ig==", "commentCharacter": "Iw==" } }, "outputSerialization": { "outputHeader": FALSE, "csv": { "quoteFields": "ALWAYS", "recordDelimiter": "Cg==", "fieldDelimiter": "LA==", "quoteCharacter": "Ig==" } }, "requestProgress": { "enabled": false } } }
-
CSV file response example
HTTP/1.1 200 OK x-bce-request-id: 4db2b34d-654d-4d8a-b49b-3049ca786409 Date: Wed, 06 Apr 2016 06:34:40 GMT ETag: "1b2cf535f27731c974343645a3985328" Transfer-Encoding: chunked Connection: close Server: BceBos ----- Body ------ <Records message> …… <Continuation Message> …… <Records message> <Continuation Message> <End message>
-
JSON file request example
POST /object?select&type=json HTTP/1.1 Host: bucket.bj.bcebos.com Date: Thu, 15 May 2017 00:17:23 GMT Authorization: <Authorization_String> Content-Type: application/json; charset=utf-8 Content-Length: 512 { "selectRequest": { "expression": "c2VsZWN0IGNvdW50KCopIGZyb20gbxkl2JqZWN0IHdoZXJlIF80ID4gNDU=", "expressionType": "SQL", "inputSerialization": { "compressionType": "NONE", "json": { "type": "DOCUMENT" } }, "outputSerialization": { "json": { "recordDelimiter": "Cg==" } }, "requestProgress": { "enabled": false } } }
-
JSON file response example
HTTP/1.1 200 OK x-bce-request-id: 4db2b34d-654d-4d8a-b49b-3049ca786409 Date: Wed, 06 Apr 2016 06:34:40 GMT ETag: "1b2cf535f27731c974343645a3985328" Transfer-Encoding: chunked Connection: close Server: BceBos ----- Body ------ <Records message> …… <Continuation Message> …… <Records message> <Continuation Message> <End message>