Baidu AI Cloud
中国站

百度智能云

Object Storage

SelectObject

Interface Description

This interface is used to execute the SQL statement of the specified object in the bucket, and select and return the specified content. The requester must have the read permission of the selected object. Before SelectObject, you need to ensure that the corresponding bucket and object already exist. For more information, see the SelectObject Developer Documentation.

Request

  • Request syntax

    POST /<ObjectKey>?select&type=json/csv HTTP/1.1
    Host: <BucketName>.bj.bcebos.com
    Date: <Date>
    Authorization: <Authorization_String>
    Content-Type: application/json; charset=utf-8
    Content-Length: <Content_Length>
    {
        "selectRequest": {
            "expression": "Base64Encode(Select * from BosObject)",
            "expressionType": "SQL",
            "inputSerialization": {
                "compressionType": "GZIP/NONE",
                // JSON or CSV
            },
            "outputSerialization": {
                // JSON or CSV
            },
            "requestProgress": {
                "enabled": false/true
            }
        }
    } 
  • Request parameters

    Terms Type Description Required?
    type string The destination object type selected, which can be JSON/CSV currently. Yes
  • Request body

    JSON file

    Name Required? Type Description
    selectRequest Yes - JSON body root node
    • expression | Yes | string | Base64 encoded SQL statement
    • expressionType | Yes | string | Query statement syntax type, which can be SQL only.
    • inputSerialization | Yes | - | Input stream node, whose child node describes the queried object format information. ++compressionType | No | string | Specify whether the queried object is compressed, "NONE" or "GZIP" optional. ++ json | Yes | - | JSON node, whose child node describes the JSON file related information. +++ type | Yes | string | JSON type, queriedJSON object format, "DOCUMENT" or "LINES" optional.
    • outputSerialization | Yes | - | Output stream node, whose child node describes the return format information of the query result. ++ json | Yes | | JSON node, whose child node describes the returned JSON data related information. +++ recordDelimiter | No | string | Specify the newline character with Base64 encoding. The default value is \n (optinal)
    • requestProgress | No | - | Select the progress information node, whose child node describes the execution progress of Select operation and returns it to the user within 3s regularly. ++ enabled | No | boolean | Describe whether the progress information needs to be returned regularly, false/true optional. If the data filtration takes a long time, it may cause the timeout error 504, and you can set it to “True” to maintain the HTTP connection.

    CSV files

    Name Required? Type Description
    selectRequest Yes - JSON body root node
    + expression Yes string Base64 encoded SQL statement
    + expressionType Yes string Query statement syntax type, which can be SQL only.
    + inputSerialization Yes - Input stream node, whose child node describes the queried object format information. ++ compressionType
    + outputSerialization Yes - Output stream node, whose child node describes the return format information of query result. ++ outputHeader
    + requestProgress No - Select the progress information node, whose child node describes the execution progress of Select operation and returns it to the user within 3s regularly. ++ enabled
  • Request header field

    No special header field.

Response

  • Response header field

    Name Type Description
    Transfer-Encoding String The value is chunked, which indicates that the content is returned in the form of the HTTP1.1 encoded chunk.
  • Response parameters

    None

  • Response body

    The response of the SelectObject interface is returned in the chunk form, including Records message, Continuation message, and End message.

    message Format Description
    Records message prelude(8 byte) + n * (header_key_len(1 byte) + header_key + header_val_len(2 byte) + header_val) + payload + crc32(4 byte) It contains the data returned by the Select request, which can be one line or multiple lines of records.
    Continuation message The same as above, which is in a fixed format, but the payload content is different. It returns the current Select progress (number of bytes scanned/number of bytes returned) to the client every 3 seconds regularly and maintains the HTTP connection.
    End message The same as above, which is in a fixed format. The payload content is empty. It indicates the end of this Select request. The headers field contains error-code, error-message, message-type and bytes-scanned information.

    Detailed description of the message format:

    select-msg.png

    1. There are 8 bytes in the prelude section. The first 4 bytes represent the total length of messages, and the last 4 bytes represent the total length of headers, total chunk length (prelude the value stored in the first 4 bytes) - Total header length - prelude the 8 bytes- 4 byte of crc32 = Total length of the payload data. CRC32 indicates the erasure code of the whole message.
    2. Headers contain the following custom <key,value>: “message-type”: {"Records", "Cont", "End"}, "error-code": specific error code and "error-message": "detailed error information".
    3. Payload represents the real data returned, which can be in any format. The payload of the Continuation message contains the Select progress information represented by BytesScanned and BytesReturned fields.

Example

  • CSV file request example

    POST /object?select&type=csv HTTP/1.1
    Host: bucket.bj.bcebos.com
    Date: Thu, 15 May 2017 00:17:23 GMT
    Authorization: <Authorization_String>
    Content-Type: application/json; charset=utf-8
    Content-Length: 512
    {
        "selectRequest": {
            "expression": "c2VsZWN0IGNvdW50KCopIGZyb20gbxkl2JqZWN0IHdoZXJlIF80ID4gNDU=",
            "expressionType": "SQL",
            "inputSerialization": {
                "compressionType": "NONE",
                "csv": {
                    "fileHeaderInfo": "NONE",
                    "recordDelimiter": "Cg==",
                    "fieldDelimiter": "LA==",
                    "quoteCharacter": "Ig==",
                    "commentCharacter": "Iw=="
                }
            },
            "outputSerialization": {
                "outputHeader": FALSE,
                "csv": {
                    "quoteFields": "ALWAYS",
                    "recordDelimiter": "Cg==",
                    "fieldDelimiter": "LA==",
                    "quoteCharacter": "Ig=="
                }
            },
            "requestProgress": {
                "enabled": false
            }
        }
    }
  • CSV file response example

    HTTP/1.1 200 OK
    x-bce-request-id: 4db2b34d-654d-4d8a-b49b-3049ca786409
    Date: Wed, 06 Apr 2016 06:34:40 GMT
    ETag: "1b2cf535f27731c974343645a3985328"
    Transfer-Encoding: chunked
    Connection: close
    Server: BceBos
    ----- Body ------
    <Records message>
    ……
    <Continuation Message>
    ……
    <Records message>
    <Continuation Message>
    <End message>
  • JSON file request example

    POST /object?select&type=json HTTP/1.1
    Host: bucket.bj.bcebos.com
    Date: Thu, 15 May 2017 00:17:23 GMT
    Authorization: <Authorization_String>
    Content-Type: application/json; charset=utf-8
    Content-Length: 512
    {
        "selectRequest": {
            "expression": "c2VsZWN0IGNvdW50KCopIGZyb20gbxkl2JqZWN0IHdoZXJlIF80ID4gNDU=",
            "expressionType": "SQL",
            "inputSerialization": {
                "compressionType": "NONE",
                "json": {
                    "type": "DOCUMENT"
                }
            },
            "outputSerialization": {
                "json": {
                    "recordDelimiter": "Cg=="
                }
            },
            "requestProgress": {
                "enabled": false
            }
        }
    }
  • JSON file response example

    HTTP/1.1 200 OK
    x-bce-request-id: 4db2b34d-654d-4d8a-b49b-3049ca786409
    Date: Wed, 06 Apr 2016 06:34:40 GMT
    ETag: "1b2cf535f27731c974343645a3985328"
    Transfer-Encoding: chunked
    Connection: close
    Server: BceBos
    ----- Body ------
    <Records message>
    ……
    <Continuation Message>
    ……
    <Records message>
    <Continuation Message>
    <End message>  
Previous
Access Control
Next
Soft Link