Select file
The BOS Java SDK provides the SelectObject API, which executes an SQL statement against a specified object in a bucket and returns the selected content. For details, see [Select Object](BOS/API Reference/Object-Related Interface/Select scanning/SelectObject.md). The currently supported object types are CSV (including TSV and other CSV-like files), JSON, and Parquet.
For sample code, see the File Selection Demo.
- Select CSV files
- Select JSON files
- Select Parquet files
Select CSV files
For selecting CSV files with the Java SDK, please refer to the following code:
// Upload a small CSV object to query; client is an initialized BosClient.
final String csvContent = "header1,header2,header3\r\n" +
        "1,2,3.4\r\n" +
        "a,b,c\r\n" +
        "\"d\",\"e\",\"f\"\r\n" +
        "true,false,true\r\n" +
        "2006-01-02 15:04:06,2006-01-02 16:04:06,2006-01-02 17:04:06";
client.putObject("bucketName", "test-csv", new ByteArrayInputStream(csvContent.getBytes()));
SelectObjectRequest request = new SelectObjectRequest("bucketName", "test-csv")
        .withSelectType("csv")
        .withExpression("select * from BosObject limit 3")
        .withInputSerialization(new InputSerialization()
                .withCompressionType("NONE")
                // With "NONE", the first row is returned as data rather than used as a header.
                .withFileHeaderInfo("NONE")
                // Must match the line endings of the uploaded content.
                .withRecordDelimiter("\r\n")
                .withFieldDelimiter(",")
                .withQuoteCharacter("\"")
                .withCommentCharacter("#"))
        .withOutputSerialization(new OutputSerialization()
                .withOutputHeader(false)
                .withQuoteFields("ALWAYS")
                .withRecordDelimiter("\n")
                .withFieldDelimiter(",")
                .withQuoteCharacter("\""))
        .withRequestProgress(false);
SelectObjectResponse response = client.selectObject(request);
// Output the returned records
SelectObjectResponse.Messages messages = response.getMessages();
while (messages.hasNext()) {
    SelectObjectResponse.CommonMessage message = messages.next();
    if (message.Type.equals("Records")) {
        for (String record : message.getRecords()) {
            System.out.println(record);
        }
    }
}
Results of selecting CSV files:
"header1","header2","header3"
"1","2","3.4"
"a","b","c"
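Because the request sets quoteFields to "ALWAYS", every field in a returned record is wrapped in the quote character, so a caller usually has to unquote the fields before using them. The following is a minimal sketch of such a parser using only the Java standard library. The class name `QuotedRecordParser` is hypothetical and not part of the BOS SDK, and the handling of a doubled quote character as an escaped quote is an assumption borrowed from common CSV practice, not something the SelectObject documentation confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the BOS SDK.
class QuotedRecordParser {
    /** Splits one returned record into unquoted fields. */
    static List<String> parse(String record, char fieldDelimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    // Assumption: a doubled quote inside a quoted field is a literal quote.
                    if (i + 1 < record.length() && record.charAt(i + 1) == quote) {
                        current.append(quote);
                        i++;
                    } else {
                        inQuotes = false;
                    }
                } else {
                    current.append(c);
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == fieldDelimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // One of the records returned by the CSV example.
        System.out.println(parse("\"1\",\"2\",\"3.4\"", ',', '"')); // prints [1, 2, 3.4]
    }
}
```

The delimiter and quote arguments should be the same values passed to the output serialization settings of the request.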
Note:
- On Unix/Linux systems, each line ends with a line feed only, i.e., "\n".
- On Windows systems, each line ends with a carriage return followed by a line feed, i.e., "\r\n".
- On Mac systems, each line ends with a line feed, i.e., "\n"; only classic Mac OS (version 9 and earlier) uses "\r".
- Set the recordDelimiter appropriately based on the file content.
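When the origin of a file is unknown, the record delimiter can be guessed from the content before building the request. The sketch below, which uses only the Java standard library, returns the first line break it finds. The class name `RecordDelimiterDetector` is hypothetical and not part of the BOS SDK, and the approach assumes that the first line break in the text is a record boundary, which does not hold if a quoted field in the first record contains a line break.

```java
// Hypothetical helper, not part of the BOS SDK.
class RecordDelimiterDetector {
    /**
     * Returns "\r\n", "\n", or "\r" according to the first line break in the text.
     * Falls back to "\n" when the text contains no line break.
     */
    static String detect(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                return "\n";
            }
            if (c == '\r') {
                boolean followedByLineFeed = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return followedByLineFeed ? "\r\n" : "\r";
            }
        }
        return "\n";
    }

    public static void main(String[] args) {
        String delimiter = detect("header1,header2\r\n1,2\r\n");
        // Escape the control characters so the result is visible when printed.
        System.out.println(delimiter.replace("\r", "\\r").replace("\n", "\\n")); // prints \r\n
    }
}
```

The detected value can then be passed as the recordDelimiter of the input serialization settings.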
Select JSON files
For selecting JSON files with the Java SDK, please refer to the following code:
// Upload a small JSON object to query; client is an initialized BosClient.
final String jsonContent = "{\n" +
        "\t\"name\": \"Smith\",\n" +
        "\t\"age\": 16,\n" +
        "\t\"org\": null\n" +
        "}\n" +
        "{\n" +
        "\t\"name\": \"charles\",\n" +
        "\t\"age\": 27,\n" +
        "\t\"org\": \"baidu\"\n" +
        "}\n" +
        "{\n" +
        "\t\"name\": \"jack\",\n" +
        "\t\"age\": 35,\n" +
        "\t\"org\": \"bos\"\n" +
        "}";
client.putObject("bucketName", "test-json", new ByteArrayInputStream(jsonContent.getBytes()));
SelectObjectRequest request = new SelectObjectRequest("bucketName", "test-json")
        .withSelectType("json")
        .withExpression("select * from BosObject where age > 20")
        .withInputSerialization(new InputSerialization()
                .withCompressionType("NONE")
                .withJsonType("LINES"))
        .withOutputSerialization(new OutputSerialization()
                .withRecordDelimiter("\n"))
        .withRequestProgress(false);
SelectObjectResponse response = client.selectObject(request);
// Output the returned records
SelectObjectResponse.Messages messages = response.getMessages();
while (messages.hasNext()) {
    SelectObjectResponse.CommonMessage message = messages.next();
    if (message.Type.equals("Records")) {
        for (String record : message.getRecords()) {
            System.out.println(record);
        }
    }
}
Results of selecting JSON files:
{"name":"charles","age":27,"org":"baidu"}
{"name":"jack","age":35,"org":"bos"}
Select Parquet files
For selecting Parquet files with the Java SDK, please refer to the following code:
/*
 Content parsed from Parquet files
{"Name":"StudentName","Age":20,"Id":0,"Weight":50,"Sex":true,"Day":19240,"Scores":{"computer":80,"math":90,"physics":90}}
{"Name":"StudentName","Age":21,"Id":1,"Weight":50.1,"Sex":false,"Day":19240,"Scores":{"computer":81,"math":91,"physics":91}}
{"Name":"StudentName","Age":22,"Id":2,"Weight":50.2,"Sex":true,"Day":19240,"Scores":{"computer":82,"math":92,"physics":92}}
{"Name":"StudentName","Age":23,"Id":3,"Weight":50.3,"Sex":false,"Day":19240,"Scores":{"computer":83,"math":93,"physics":90}}
{"Name":"StudentName","Age":24,"Id":4,"Weight":50.4,"Sex":true,"Day":19240,"Scores":{"computer":84,"math":94,"physics":91}}
{"Name":"StudentName","Age":20,"Id":5,"Weight":50.5,"Sex":false,"Day":19240,"Scores":{"computer":85,"math":90,"physics":92}}
{"Name":"StudentName","Age":21,"Id":6,"Weight":50.6,"Sex":true,"Day":19240,"Scores":{"computer":86,"math":91,"physics":90}}
{"Name":"StudentName","Age":22,"Id":7,"Weight":50.7,"Sex":false,"Day":19240,"Scores":{"computer":87,"math":92,"physics":91}}
{"Name":"StudentName","Age":23,"Id":8,"Weight":50.8,"Sex":true,"Day":19240,"Scores":{"computer":88,"math":93,"physics":92}}
{"Name":"StudentName","Age":24,"Id":9,"Weight":50.9,"Sex":false,"Day":19240,"Scores":{"computer":89,"math":94,"physics":90}}
*/
SelectObjectRequest request = new SelectObjectRequest("bucketName", "test-parquet")
        .withSelectType("parquet")
        .withExpression("select * from BosObject s where s.Scores.computer > 85")
        .withInputSerialization(new InputSerialization()
                .withCompressionType("NONE"))
        .withOutputSerialization(new OutputSerialization()
                .withRecordDelimiter("\n"))
        .withRequestProgress(false);
SelectObjectResponse response = client.selectObject(request);
// Output the returned records
SelectObjectResponse.Messages messages = response.getMessages();
while (messages.hasNext()) {
    SelectObjectResponse.CommonMessage message = messages.next();
    if (message.Type.equals("Records")) {
        for (String record : message.getRecords()) {
            System.out.println(record);
        }
    }
}
Results of selecting Parquet files:
{"Name":"StudentName","Age":21,"Id":6,"Weight":50.6,"Sex":true,"Day":19240,"Scores":{"computer":86,"math":91,"physics":90}}
{"Name":"StudentName","Age":22,"Id":7,"Weight":50.7,"Sex":false,"Day":19240,"Scores":{"computer":87,"math":92,"physics":91}}
{"Name":"StudentName","Age":23,"Id":8,"Weight":50.8,"Sex":true,"Day":19240,"Scores":{"computer":88,"math":93,"physics":92}}
{"Name":"StudentName","Age":24,"Id":9,"Weight":50.9,"Sex":false,"Day":19240,"Scores":{"computer":89,"math":94,"physics":90}}
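The expression `s.Scores.computer > 85` addresses a field nested inside the `Scores` structure of each row. To make the predicate concrete, the sketch below applies the same comparison locally to in-memory rows shaped like the sample data, using only the Java standard library. It does not call BOS. The class `NestedFieldFilter` and its methods are hypothetical, and only the `Id` and `Scores.computer` fields are modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local illustration, not part of the BOS SDK.
class NestedFieldFilter {
    /** Builds one row with the fields used by the predicate. */
    static Map<String, Object> row(int id, int computer) {
        Map<String, Object> scores = new LinkedHashMap<>();
        scores.put("computer", computer);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("Id", id);
        row.put("Scores", scores);
        return row;
    }

    /** Local equivalent of: where s.Scores.computer > threshold */
    static List<Integer> idsWithComputerAbove(List<Map<String, Object>> rows, int threshold) {
        List<Integer> ids = new ArrayList<>();
        for (Map<String, Object> r : rows) {
            @SuppressWarnings("unchecked")
            Map<String, Object> scores = (Map<String, Object>) r.get("Scores");
            if ((Integer) scores.get("computer") > threshold) {
                ids.add((Integer) r.get("Id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int id = 0; id < 10; id++) {
            rows.add(row(id, 80 + id)); // computer scores 80..89, as in the sample data
        }
        System.out.println(idsWithComputerAbove(rows, 85)); // prints [6, 7, 8, 9]
    }
}
```

The row with a computer score of exactly 85 (Id 5) is excluded because the comparison is strict, which matches the four rows returned above.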
Note that the parameters used to initialize SelectObjectRequest differ significantly between CSV, JSON, and Parquet queries. For detailed parameter settings, see [SelectObject API](BOS/API Reference/Object-Related Interface/Select scanning/SelectObject.md).
