SELECT-INTO-OUTFILE
SELECT INTO OUTFILE
Description
This command is used to export the result set of SQL output to local disk or to remote storage through Broker.
query_stmt
INTO OUTFILE "file_path"
[format_as]
[properties]
-
query_stmt
Any query statement.
-
file_path
Path to file storage and file prefix. For example:
bos://my_bucket/my_file_ # remote catalog file:///to/loca/path/my_file_ # local catalog
The final file name will consist of
my_file_ + fileS/N + file format suffix
.The file serial number starts from 0, and the number is the number of files divided. For example:
my_file_0.csv my_file_1.csv my_file_2.csv
If the local file mode is used, the exported file will be stored on a random Compute Node inthe cluster. Specific node information is known in the URL of the returned result.
-
format_as
Export file format. Only support CN currently
FORMAT AS CSV
-
properties
Related properties. The properties required by broker must start with
broker.
. For example:( "broker.prop_key" = "prop_val", )
其他参数如:
column_separator
:Column separator, only applicable to CSV format. Default to \t.line_delimiter
:line delimiter, only applicable to CSV format. Default to \n.max_file_size
:The maximum size of a single file. Default to 1GB. The value range is between 5MB and 2GB. Files larger than this size will be split.success_file_name
:Whether to generate an empty file identifier after success. File name is "my_file_file_name". In which,mysq_file
is the prefix specified infile_path
,file_name
is the value of the parameter.
Description of return results:
If it exports and returns normally, the result is as follows:
mysql> select * from tbl1 limit 10 into outfile "file:///home/work/path/result_";
+------------+-----------+----------+--------------+
| FileNumber | TotalRows | FileSize | URL |
+------------+-----------+----------+--------------+
| 1 | 2 | 8 | 192.168.1.10 |
+------------+-----------+----------+--------------+
1 row in set (0.05 sec)
FileNumber
:The number of files finally generated.TotalRows
:Number of rows in result set.FileSize
:Total size of export file, in bytes.URL
:It exporting to local disk, it shows which Compute Node to export to.
If there is error in execution, an error message will be returned, such as:
mysql> SELECT * FROM tbl INTO OUTFILE ...
ERROR 1064 (HY000): errCode = 2, detailMessage = Open broker writer failed ...
Example
-
Export simple query results to file
bos://my_bucket/result_
. Specify that the export format is CSV. Usemy_broker
and set kerberos authentication information. The specified column separator is,
,and the line delimiter is\n
.SELECT * FROM tbl INTO OUTFILE "bos://my_bucket/result_" FORMAT AS CSV PROPERTIES ( "broker.name" = "bos", "broker.bos_endpoint" = "http://bj.bcebos.com", "broker.bos_accesskey" = "xxxxxxxxxxxxxxxxxxxxxxxxxx", "broker.bos_secret_accesskey" = "yyyyyyyyyyyyyyyyyyyyyyyyyy", "column_separator" = ",", "line_delimiter" = "\n", "max_file_size" = "100MB" );
If the final generated file is not larger than 100MB, then:
result_0.csv
.If it is larger than 100MB, it may be
result_0.csv, result_1.csv, ...
. -
Export the query result of CTE statement to file
bos://my_bucket/result_
.WITH x1 AS (SELECT k1, k2 FROM tbl1), x2 AS (SELECT k3 FROM tbl2) SELEC k1 FROM x1 UNION SELECT k3 FROM x2 INTO OUTFILE "bos://my_bucket/result_" PROPERTIES ( "broker.name" = "bos", "broker.bos_endpoint" = "http://bj.bcebos.com", "broker.bos_accesskey" = "xxxxxxxxxxxxxxxxxxxxxxxxxx", "broker.bos_secret_accesskey" = "yyyyyyyyyyyyyyyyyyyyyyyyyy" );
If the final generated file is not larger than 1GB, then:
result_0.csv
.If it is larger than 1GB, it may be
result_0.csv, result_1.csv, ...
. -
Export the query result of UNION statement to file bos:
bos://my_bucket/result.txt
. And an empty file identifier is generated after successful exporting.SELECT k1 FROM tbl1 UNION SELECT k2 FROM tbl1 INTO OUTFILE "bos://bucket/result_" PROPERTIES ( "broker.name" = "my_broker", "broker.bos_endpoint" = "http://bj.bcebos.com", "broker.bos_accesskey" = "xxxxxxxxxxxxxxxxxxxxxxxxxx", "broker.bos_secret_accesskey" = "yyyyyyyyyyyyyyyyyyyyyyyyyy", "success_file_name" = "SUCCESS" );
If the final generated file is not larger than 1GB, then:
result_0.parquet
.If it is larger than 1GB, it may be
result_0.parquet, result_1.parquet, ...
.A successful file is identified as
result_SUCCESS
. -
Export results to local disk.
SELECT k1 FROM tbl1 UNION SELECT k2 FROM tbl1 INTO OUTFILE "file:///local/path/result_" PROPERTIES ( "column_separator" = ",", "line_delimiter" = "\n", "max_file_size" = "100MB" );
Keywords
SELECT, INTO, OUTFILE
Best Practices
-
Export data quantity and export efficiency
The function is essentially to execute a SQL query command. The final result is single-threaded output. Therefore, the time consumption of the whole export includes the time consumption of the query itself and the time consumption of writing the final result set. If the query is large, it is necessary to set the session variable query_timeout to extend the query timeout appropriately.
-
Management of export files
Palo does not manage exported files, Including those files that were successfully exported or remained after the export failed, all need to be handled by users themselves.
-
Export to local files
The function of exporting to local files is not applicable to public cloud users, but only to users deployed in privatization. And the default user has complete control authority over the cluster nodes. Palo does not check the legality of the export path filled in by the user. If the process user of Palo does not have write access to the path, or the path does not exist, an error will be reported. At the same time, for security reasons, if a file with the same name already exists in this path, the export will also fail.
Palo does not manage files exported locally, and does not check disk space, etc. These files need to be managed by users themselves, such as cleaning.
-
Guarantee of result integrity
This command is a synchronous command, so it is possible that the task connection is disconnected during the execution, and it is impossible to live. Whether the exported data ends normally or is complete. At this time, user can use the
success_file_name
parameter to request that a successful file identifier be generated in the directory after the task is successful. And user can judge whether the export ends normally through this file.