Baidu AI Cloud
中国站

百度智能云

Data Warehouse

INSERT

INSERT

Description

INSERT INTO table_name
[ PARTITION (p1, ...) ]
[ WITH LABEL label]
[ (column [, ...]) ]
{ VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
  • table_name

    The name of the table that needs to load data.

  • PARTITION

    Specify the partition that needs to load data.

  • WITH LABEL

    Specify a Label for this INSERT operation. If not specified, the system will automatically generate a random ID.

  • column

    Specify the target column order

  • VALUES | query

    INSERT operation supports two ways. One is to write single-line data through the VALUES statement, that is, constant expression. The other is to use Query statement to query data from other tables and load them.

Example

  1. Load a row of data into the test table. The first sentence and the second sentence have the same effect. When no target column is specified, the column order in the table is used as the default target column.

    The meaning expressed in the third and fourth statements is the same, and the default value of c2 column is used to complete the data load.

  2. Load multiple rows of data into the testtable

    INSERT INTO test VALUES (1, 2), (3, 2 + 2);
    
    INSERT INTO test (c1, c2) VALUES (1, 2), (3, 2 * 2);
    
    INSERT INTO test (c1) VALUES (1), (3);
    
    INSERT INTO test (c1, c2) VALUES (1, DEFAULT), (3, DEFAULT);

    The first statement and the second statement have the same effect, and load two pieces of data into thetest table at one time

    The effect of the third and fourth statements is known, and the default value of c2column is used to load two pieces of data into thetest table

  3. Load a query statement result into the test table

    INSERT INTO test SELECT * FROM test2;
    
    INSERT INTO test (c1, c2) SELECT * from test2;
    
    INSERT INTO tbl1 WITH LABEL label1
    WITH cte1 AS (SELECT * FROM tbl1), cte2 AS (SELECT * FROM tbl2)
    SELECT k1 FROM cte1 JOIN cte2 WHERE cte1.k1 = 1;
  4. Load a query statement result into the test table and specify the partition and label

    INSERT INTO test PARTITION(p1, p2) WITH LABEL `label1` SELECT * FROM test2;
    INSERT INTO test WITH LABEL `label1` (c1, c2) SELECT * from test2;

Keywords

INSERT

Best Practices

  1. View the returned results

    INSERT operation is a synchronous operation, and the result returned indicates the end of the operation. Users need to carry out corresponding processing according to different returned results.

    1. Execution succeeded, and the result set was empty

      If the result set of the select statement corresponding to insert is empty, it returns as follows:

      mysql> insert into tbl1 select * from empty_tbl;
      Query OK, 0 rows affected (0.02 sec)

      Query OK indicates successful execution.0 rows affected indicates that no data has been loaded.

    2. Execution succeeded, and the result set was not empty

      When the result set is not empty. The returned results can be divided into the following situations:

      1. Insert successfully executed and visible:

        mysql> insert into tbl1 select * from tbl2;
        Query OK, 4 rows affected (0.38 sec)
        {'label':'insert_8510c568-9eda-4173-9e36-6adc7d35291c', 'status':'visible', 'txnId':'4005'}
        
        mysql> insert into tbl1 with label my_label1 select * from tbl2;
        Query OK, 4 rows affected (0.38 sec)
        {'label':'my_label1', 'status':'visible', 'txnId':'4005'}
        
        mysql> insert into tbl1 select * from tbl2;
        Query OK, 2 rows affected, 2 warnings (0.31 sec)
        {'label':'insert_f0747f0e-7a35-46e2-affa-13a235f4020d', 'status':'visible', 'txnId':'4005'}
        
        mysql> insert into tbl1 select * from tbl2;
        Query OK, 2 rows affected, 2 warnings (0.31 sec)
        {'label':'insert_f0747f0e-7a35-46e2-affa-13a235f4020d', 'status':'committed', 'txnId':'4005'}

        Query OK indicates successful execution.4 rows affected indicates that total 4 rows of data have been loaded.2 warnings indicates the number of rows filtered.

        At the same time, a json string is returned:

        {'label':'my_label1', 'status':'visible', 'txnId':'4005'}
        {'label':'insert_f0747f0e-7a35-46e2-affa-13a235f4020d', 'status':'committed', 'txnId':'4005'}
        {'label':'my_label1', 'status':'visible', 'txnId':'4005', 'err':'some other error'}

        label is a user-specified label or an automatically generated label. Label is the id of the Insert Into load job. Every load job has a unique Label within a single database.

        status indicates whether the loaded data is visible. If visible, show visible; if not, show committed

        txnId is the id of the load transaction corresponding to this insert.

        err field displays some other unexpected errors.

        When it is required to view the filtered rows, the user can use the following statement

        show load where label="xxx";

        The URL in the returned result can be used to query the wrong data. Details can be found in the summary of View Error Line.

        Invisible data is a temporary state, and this batch of data will eventually be visible

        The visible status of this batch of data can be viewed through the following statement:

        show transaction where id=4005;

        If the TransactionStatus column in the returned result is visible, the representation data is visible.

    3. Execution failed

      Execution failure means that no data has been successfully loaded, and returns as follows:

      mysql> insert into tbl1 select * from tbl2 where k1 = "a";
      ERROR 1064 (HY000): all partitions have no load data. url: http://10.74.167.16:8042/api/_load_error_log?file=__shard_2/error_log_insert_stmt_ba8bb9e158e4879-ae8de8507c0bf8a2_ba8bb9e158e4879_ae8de8507c0bf8a2

      In which, ERROR 1064 (HY000): all partitions have no load data shows the failure reason. The following url can be used to query the wrong data:

      show load warnings on "url";

      Specific error lines can be viewed.

  2. Timeout period

    The timeout for INSERT operation is determined by Session variable query_timeout , which is 5 min by default. If it times out, the job will be cancelled.

  3. Label and atomicity

    INSERT operation can also guarantee the atomicity of load. Please refer to Load transactions and atomicity document.

    When CTE(Common Table Expressions) needs to be used as the query part in insert operation, WITH LABEL and column must be specified.

  4. Filtering threshold

    Unlike other load methods, INSERT operation cannot specify(max_filter_ratio). The default filtering threshold is 1, that is, all lines with errors can be ignored.

    For business scenarios that require data that cannot be filtered, user can set Session variable enable_insert_strict as true to ensure that theINSERT will not be executed successfully when when data is filtered out.

  5. Performance problem

    It is not suggested to insert a single line by using the VALUES method. If it must be used in this way, please combine multiple rows of data into one INSERT statement for batch submission.

Previous
SELECT-INTO-OUTFILE
Next
Information View Statement