Data synchronization
Overview
Data synchronization is an automatic, asynchronous process that copies files (objects) between storage spaces (buckets) in the same or different BOS data centers (regions). It replicates object creations, updates, and deletions from the source bucket to the target bucket.
The data synchronization feature supports cross-region, cross-account, and other bucket replication needs. Objects in the target bucket are exact replicas of those in the source bucket: they have the same object names, metadata (such as creation time, owner, custom metadata, and object ACL), and object content.
Application scenarios
You may configure data synchronization for a bucket for various reasons, including:
- Efficient access and latency reduction: Accessing a bucket or object across regions can be slow because of the geographic distance. Data synchronization lets you pre-synchronize the required data to the target region to improve access efficiency.
- Cross-region disaster recovery: To meet compliance requirements in sectors such as finance and government, multiple data replicas are maintained within the same region. To guard against data center-level disasters such as floods and earthquakes, a replica must also be kept in another region. BOS data synchronization provides this capability.
- Cross-account replication: Suitable for enterprises that use multiple accounts. For data security and backup purposes, an enterprise may want to periodically sync data from Account A to Account B to replicate and share data across accounts.
- Cross-region data reuse: When business requirements call for the same data set in several regions, for example computing clusters built across regions, BOS data synchronization can be used to replicate the data.
Operation types
BOS supports data synchronization via the console, API, and SDK. Details are as follows:
- [Data synchronization via console](BOS/Console Operation Guide/Managing Bucket/Set data synchronization.md)
- Data synchronization via API
- Data synchronization via SDK
Instructions for use
- Users can define a file name prefix to specify the data to be synchronized in the source bucket or choose to sync all data from the source bucket.
- The storage class of objects in the target bucket can match that of the source objects or be set to a different class as needed.
- Replicating objects that are not in the standard storage class incurs restoration fees.
- Files in the target bucket cannot be stored in the archive storage class.
- Once a rule is successfully added, users can view all existing synchronization policies for the current bucket in the list, and also choose to edit or delete them.
- When historical file replication is enabled, historical objects follow the same prefix rule as incremental data: only files under the configured prefix are synchronized. If you choose to synchronize all data in the bucket, all historical data is synchronized as well. If a rule is modified, historical data that has already been synchronized is synchronized again.
- The two buckets for data synchronization can be cross-region or same-region. Only data synchronization between cross-region buckets will trigger traffic fees; data synchronization between buckets in the same region will not incur traffic fees.
- The two buckets for data synchronization can belong to different accounts or to the same account. Make sure the target bucket name you enter is accurate. If the target bucket in the destination account is located in another region, cross-region traffic fees apply. The source account bears the traffic fees, and the destination account bears the storage and request fees.
- Multiple synchronization rules between buckets are supported: one source bucket can have multiple target buckets, and one target bucket can have multiple source buckets. For example, with three buckets named A, B, and C, you can:
  - Set A as the source bucket of B, and B as the source bucket of C, at the same time.
  - Set A as the source bucket of B, and B as the source bucket of A, at the same time.
  - Set A as the source bucket of both B and C at the same time.
  - Set C as the target bucket of both A and B at the same time.
- For data security reasons, BOS does not obtain or return the activation status of cross-account synchronization. If you have set up and enabled cross-account synchronization, check in the destination account that data synchronization has been enabled. If the source data has not been synchronized to the destination account’s bucket within 5 minutes, check again whether write permission is enabled. If synchronization still fails after write permission is enabled, submit a ticket.
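The prefix, traffic-fee, and multi-rule behavior described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `SyncRule`, `rule_matches`, `incurs_traffic_fee`, and `targets_for` are hypothetical names invented for illustration and are not part of the BOS API or SDK, and the region strings are placeholder values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SyncRule:
    """Hypothetical model of one synchronization rule (not a BOS API type)."""
    source_bucket: str
    source_region: str
    target_bucket: str
    target_region: str
    prefix: str = ""  # an empty prefix means "synchronize all data in the bucket"


def rule_matches(rule: SyncRule, object_key: str) -> bool:
    """An object is selected when its name starts with the rule's prefix."""
    return object_key.startswith(rule.prefix)


def incurs_traffic_fee(rule: SyncRule) -> bool:
    """Only synchronization between buckets in different regions incurs traffic fees."""
    return rule.source_region != rule.target_region


def targets_for(rules: List[SyncRule], bucket: str, object_key: str) -> List[str]:
    """Return the target buckets that would receive an object written to `bucket`."""
    return [
        r.target_bucket
        for r in rules
        if r.source_bucket == bucket and rule_matches(r, object_key)
    ]


# Illustrative rules: A -> B (prefix "logs/"), B -> A (all data), A -> C (all data).
RULES = [
    SyncRule("A", "region-1", "B", "region-1", prefix="logs/"),
    SyncRule("B", "region-1", "A", "region-1"),
    SyncRule("A", "region-1", "C", "region-2"),
]

print(targets_for(RULES, "A", "logs/app.log"))  # matched by both rules on A
print(targets_for(RULES, "A", "img/cat.png"))   # matched only by the all-data rule
print(incurs_traffic_fee(RULES[2]))             # A and C are in different regions
```

In this model, a rule with an empty prefix selects every object, which mirrors the "sync all data" option, and the fee check depends only on whether the two regions differ, regardless of which accounts own the buckets.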
Usage restrictions
- For buckets in a synchronization relationship, objects copied from the source bucket can overwrite objects with the same name in the target bucket. Please proceed cautiously.
- Since data synchronization uses asynchronous replication, there will be a delay in data appearing in the target bucket, typically from several minutes to several hours depending on the data size.
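The two restrictions above (same-name overwrite and asynchronous delay) can be seen in a toy simulation. This is a simplified sketch, not how BOS is implemented: `ReplicationSim` and its methods are hypothetical, and the pending queue merely stands in for whatever delay the real service has.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class ReplicationSim:
    """Toy model of asynchronous source-to-target replication (illustrative only)."""

    def __init__(self, target: Optional[Dict[str, bytes]] = None) -> None:
        self.source: Dict[str, bytes] = {}
        self.target: Dict[str, bytes] = dict(target or {})
        # Events recorded on the source but not yet applied to the target.
        self.pending: Deque[Tuple[str, str, Optional[bytes]]] = deque()

    def put(self, key: str, data: bytes) -> None:
        """Create or update an object in the source bucket."""
        self.source[key] = data
        self.pending.append(("put", key, data))

    def delete(self, key: str) -> None:
        """Delete an object from the source bucket."""
        self.source.pop(key, None)
        self.pending.append(("delete", key, None))

    def drain(self) -> None:
        """Apply all pending events to the target, in order."""
        while self.pending:
            op, key, data = self.pending.popleft()
            if op == "put":
                # A same-name object already in the target is overwritten.
                self.target[key] = data
            else:
                self.target.pop(key, None)


sim = ReplicationSim(target={"report.csv": b"written directly to target"})
sim.put("report.csv", b"from source")
print(sim.target["report.csv"])  # still the old content: replication has not run yet
sim.drain()
print(sim.target["report.csv"])  # now overwritten by the source copy
```

Until `drain()` runs, a reader of the target still sees the old content, which is why applications should not assume an object is available in the target bucket immediately after it is written to the source.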
Viewing synchronization progress
After configuring the data synchronization rule, you can click the Preview button in the Operations section of the synchronization rule on the console to check the progress of historical data synchronization, the real-time synchronization time point, and detailed configuration information for the data synchronization rule.
- Historical Data Synchronization: You can monitor the progress of historical file synchronization to check the status.
- Real-Time Data Synchronization: You can view timestamps for the latest incremental file syncs to stay updated on real-time synchronization.
For detailed operations, please refer to Console Data Synchronization Progress Query.
Note
Cross-border data compliance commitment
When you use this service or feature, your business data on the cloud will be transmitted to the selected region or the region where the service is deployed. This may involve cross-border data transfer.
By using this feature, you confirm that you have full legal authority to manage the relevant business data and that you accept complete responsibility for its transmission and associated actions. You also confirm that your data transmission complies with relevant laws and regulations, including obtaining explicit consent from data subjects, completing cross-border data security assessments, and signing standard contracts for cross-border personal information transfers with recipients. You further guarantee that your data does not include any content restricted or prohibited by applicable laws. For specific compliance requirements, please consult the appropriate authorities.
In the event of non-compliance with the above commitments, you will bear all legal consequences and compensate Baidu AI Cloud and/or its affiliates for any resulting losses.
If Baidu AI Cloud is required to modify or discontinue this feature due to changes in law, policy, or regulations, you understand and accept that this will not constitute a breach of contract. Baidu AI Cloud will provide transitional assistance for any service adjustments or termination.
