Data Delete

Palo supports the following ways to delete data:

  • TRUNCATE: clears a table or partition directly without deleting the corresponding metadata. The operation is cheap, so it is the recommended choice when a table or partition needs to be emptied.
  • DROP: deletes a table or partition, removing both the data and the metadata.
  • DELETE: deletes the data that matches the given conditions. See the section "Delete according to conditions" in this document for details.
  • MARK DELETE: deletes data row by row. See the section "Mark delete" in this document for details.
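
As a rough illustration of the first two methods, the statements below sketch how TRUNCATE and DROP might be issued against the example table used later in this document. They assume Palo accepts Apache Doris style syntax; the table name example_table and partition name p202010 are borrowed from the DELETE examples and are only placeholders. Check the corresponding command documents for the exact syntax supported by your version.

```sql
-- Clear all data in the table; the table definition (metadata) is kept.
TRUNCATE TABLE example_table;

-- Clear the data of a single partition only (assumed partition name).
TRUNCATE TABLE example_table PARTITION (p202010);

-- Remove one partition together with its data and metadata.
ALTER TABLE example_table DROP PARTITION p202010;

-- Remove the whole table together with its data and metadata.
DROP TABLE example_table;
```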

This document mainly introduces two of these methods: DELETE and MARK DELETE. Refer to the corresponding command documents for the other methods.

Delete according to conditions

Use the DELETE command to delete data that matches given conditions. Refer to the DELETE command document for detailed instructions. Examples are as follows:

DELETE FROM example_table WHERE event_day < 20201001 AND event_key != 1000;

DELETE FROM example_table PARTITION p202010 WHERE event_key in (1000, 1001, 1002, 1003); 

The DELETE command is synchronous: a successful return means the deletion has succeeded.

View the history of DELETE operations with the following command:

mysql> SHOW DELETE FROM example_db;
+-----------+---------------+---------------------+-----------------+----------+
| TableName | PartitionName | CreateTime          | DeleteCondition | State    |
+-----------+---------------+---------------------+-----------------+----------+
| empty_tbl | p3            | 2020-04-15 23:09:35 | k1 EQ "1"       | FINISHED |
| test_tbl  | p4            | 2020-04-15 23:09:53 | k1 GT "80"      | FINISHED |
+-----------+---------------+---------------------+-----------------+----------+
2 rows in set (0.00 sec)

Refer to the SHOW DELETE command document for specific instructions.

Points for attention

  • The DELETE command is not suitable for high-frequency deletion. Sending a large number of DELETE commands in a short time, for example, seriously degrades the underlying data merging and query efficiency. A DELETE operation essentially stores a deletion condition, which is then applied to filter every row at query time, so query efficiency drops as deletion conditions accumulate.
  • Avoid alternating DELETE and LOAD operations (DELETE-LOAD-DELETE-LOAD). This pattern works against the underlying data merging strategy and may leave a large amount of unmerged data, causing a backlog.
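
Because each DELETE statement stores its own deletion condition, one way to limit how many conditions pile up is to combine several small deletions into a single statement. The sketch below reuses the table, partition, and key values from the examples above. Whether this fits depends on your workload, so treat it as an illustration rather than a prescribed method.

```sql
-- Many single-key statements each store a separate deletion condition:
--   DELETE FROM example_table PARTITION p202010 WHERE event_key = 1000;
--   DELETE FROM example_table PARTITION p202010 WHERE event_key = 1001;
--   DELETE FROM example_table PARTITION p202010 WHERE event_key = 1002;
--   DELETE FROM example_table PARTITION p202010 WHERE event_key = 1003;

-- A single statement covers the same keys with one condition:
DELETE FROM example_table PARTITION p202010 WHERE event_key IN (1000, 1001, 1002, 1003);
```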

Mark delete

The mark delete function is mainly used in scenarios that require real-time updates and synchronization, such as synchronizing MySQL Binlog data. This method can only be applied to tables that use the UNIQUE KEY model. Refer to the MARK DELETE document for details.
