Local data to cloud
Migration of existing data to the cloud
Existing data refers to the data you have already generated up to this point. This type of data is usually large in volume, requires a significant amount of time for migration, and is generally fixed with infrequent modifications. When migrating local data to the cloud, you can choose between online and offline migration methods depending on the condition of your network.
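As a rough aid for choosing between online and offline migration, the following sketch estimates how long an online upload would take. It is illustrative only: the function name, the decimal-terabyte convention, and the default 80% link utilization are assumptions, not values from BOS.

```python
def estimate_transfer_days(data_tb: float, bandwidth_mbps: float,
                           utilization: float = 0.8) -> float:
    """Estimate the days needed to upload `data_tb` terabytes.

    Assumptions (hypothetical, adjust to your environment):
    - 1 TB = 10**12 bytes (decimal terabytes)
    - `bandwidth_mbps` is the uplink speed in megabits per second
    - `utilization` is the fraction of the link actually usable for the upload
    """
    if data_tb < 0 or bandwidth_mbps <= 0 or not (0 < utilization <= 1):
        raise ValueError("invalid data size, bandwidth, or utilization")
    data_bits = data_tb * 1e12 * 8                      # terabytes -> bits
    effective_bps = bandwidth_mbps * 1e6 * utilization  # usable bits per second
    return data_bits / effective_bps / 86400            # seconds -> days


# Example: 100 TB over a 1000 Mbps uplink at 80% utilization.
print(f"{estimate_transfer_days(100, 1000):.1f} days")  # prints "11.6 days"
```

If the estimate runs to weeks or months, or the bandwidth cost over that period is high, the offline methods described later are likely the better fit.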
Method 1: Online migration via the BOS CMD tool
The online method for migrating local data to the cloud is best suited for cases where customers have good local network conditions with sufficient bandwidth. For such situations, it is recommended to use the object upload function of the BOS CMD tool to perform cloud data migration.
BOS CMD is a command-line tool provided by BOS, offering a wide range of functionalities to help users efficiently manage and operate BOS resources. It supports three object upload methods: batch upload, synchronous upload (sync), and single-file upload. Both batch upload and synchronous upload (sync) are well suited to migrating existing local data to the cloud online. Here's a brief introduction to the batch upload method:
You can execute the following command on a machine with BOS CMD installed to use batch upload:
$ bcecmd bos cp <local-path> <bos-path> --recursive [--storage-class [STORAGE_CLASS]] [--restart] [--quiet] [--yes] [--disable-bar]
Where local-path is the directory of the local files to be uploaded. For specific operations and precautions, please refer to: BOS CMD Object Upload Function
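When batch uploads are scripted, it can help to assemble the command line programmatically. The sketch below builds an argument list from the synopsis above, using only the flags shown there. The helper name `build_cp_command`, the example paths, and the bucket name are hypothetical, and the exact meaning of each flag should be checked against the BOS CMD documentation.

```python
from typing import List, Optional


def build_cp_command(local_path: str, bos_path: str,
                     storage_class: Optional[str] = None,
                     restart: bool = False, quiet: bool = False,
                     yes: bool = False, disable_bar: bool = False) -> List[str]:
    """Build an argv list for a recursive `bcecmd bos cp` batch upload.

    Only flags that appear in the synopsis above are emitted. This helper
    does not run the command; it only returns the argument list.
    """
    cmd = ["bcecmd", "bos", "cp", local_path, bos_path, "--recursive"]
    if storage_class is not None:
        cmd += ["--storage-class", storage_class]
    if restart:
        cmd.append("--restart")
    if quiet:
        cmd.append("--quiet")
    if yes:
        cmd.append("--yes")
    if disable_bar:
        cmd.append("--disable-bar")
    return cmd


# Hypothetical paths: a local directory and a bucket prefix on BOS.
print(" ".join(build_cp_command("/data/archive", "bos:/my-bucket/archive/",
                                quiet=True, yes=True)))
```

The returned list can be passed to `subprocess.run` on a machine where BOS CMD is installed and configured.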
Method 2: Physical delivery with MoonBox
MoonBox is a terabyte-level data transmission solution provided by Baidu AI Cloud Object Storage (BOS). It utilizes physical storage devices to transfer large amounts of data between Baidu AI Cloud and customer IDCs via courier services. This method addresses challenges such as high network costs, prolonged transmission times, and data security risks, all while offering simplicity, efficiency, security, and cost-effectiveness.
Application scenarios
- Your network environment is poor (low bandwidth or an unstable connection), or bandwidth costs are very high.
- You want to complete the migration to the cloud as soon as possible, but a dedicated line is too expensive.
- You do not have your own hard drives or other removable storage devices.
MoonBox provides portable storage devices supplied by BOS, featuring large capacity, intuitive operation, military-grade protection, and encryption. Each device has a nominal capacity of 96 TB and an actual usable capacity of approximately 83 TB, making it suitable for migrating hundreds of terabytes of data from a local IDC to the cloud.
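To get a feel for how many devices a migration might involve, the following sketch divides the data volume by the approximate usable capacity quoted above. The function name is hypothetical, and the calculation ignores file system overhead. The actual device allocation is arranged with BOS staff.

```python
import math

USABLE_TB_PER_DEVICE = 83  # approximate usable capacity per MoonBox device


def moonbox_devices_needed(data_tb: float,
                           usable_tb: float = USABLE_TB_PER_DEVICE) -> int:
    """Return the minimum number of devices (or device trips) for `data_tb`."""
    if data_tb < 0 or usable_tb <= 0:
        raise ValueError("data_tb must be >= 0 and usable_tb must be > 0")
    return math.ceil(data_tb / usable_tb)


# Example: 300 TB of data needs ceil(300 / 83) = 4 devices or trips.
print(moonbox_devices_needed(300))  # prints 4
```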
Usage method
To use MoonBox, contact us by submitting a ticket or through your account manager. We will schedule a usage window for you based on current device availability. Please tell us your estimated data volume (e.g., 300 TB), the city where your IDC is located, and when you expect to use the device.
Product Introduction
For a detailed introduction to MoonBox, its operation procedures, and precautions, please refer to Introduction to MoonBox and MoonBox Operation Process.
Method 3: Physical delivery with self-prepared hard drives
BOS also provides an offline method involving self-prepared hard disk delivery. Unlike "MoonBox," which supplies the physical devices, this method requires customers to provide their own hard drives. You can load data onto your hard drives, send them in batches to Baidu AI Cloud data centers, and the data will be transferred to the cloud free of charge. This method is suitable for large-scale local IDC-to-cloud data migration projects involving hundreds of terabytes. BOS has multiple dedicated units in its data centers that support simultaneous migration of up to 16 hard drives. Compared to "MoonBox," self-prepared hard drive delivery offers faster processing.
Application scenarios
This method is suitable for scenarios where customers have many hard drives and need urgent data migration to the cloud, such as:
- The on-premises IDC is being decommissioned, and its data must be moved to cloud storage as soon as possible.
- A large amount of on-premises data must be backed up to the cloud quickly, and a dedicated line is too costly.
Usage restrictions
- Only national-standard 3.5-inch or 2.5-inch portable hard drives are supported.
- This method only supports hard drives with USB interfaces. Many large-capacity 3.5-inch hard drives come equipped with SATA interfaces. Hence, you will need to purchase compatible hard drive enclosures and send both the enclosures and the hard drives to Baidu's data center. Appropriate hard drive enclosures can be purchased online by searching for "SATA to USB hard drive enclosure." Each hard drive requires its own enclosure, and you should purchase the number of enclosures necessary to match the number of hard disks. If there are fewer enclosures than hard drives, you'll need to wait for the first batch's data upload before proceeding with the next.
- Only the ext4, NTFS, and XFS file systems are supported. For more information on file systems, see the Linux manual page: man 5 fs.
- After the data copy is complete, all hard drives and enclosures you sent will be returned to you. Before they are returned, the drives will be formatted, which erases all data on them. If the data is important, back it up in advance.
- When the hard drives arrive at Baidu's data center, technicians will immediately run a full-disk check for bad sectors. If bad sectors make the data unreadable, the hard drive will be formatted and returned directly.
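The enclosure rule in the restrictions above implies that sending fewer enclosures than drives splits the upload into sequential batches. A minimal sketch of that arithmetic follows. The function name is hypothetical, and it assumes every drive needs an enclosure (that is, all drives are SATA).

```python
import math


def upload_batches(num_drives: int, num_enclosures: int) -> int:
    """Return how many sequential upload batches are needed.

    Assumes each drive needs its own enclosure, so at most
    `num_enclosures` drives can be processed per batch.
    """
    if num_drives < 0 or num_enclosures <= 0:
        raise ValueError("num_drives must be >= 0 and num_enclosures > 0")
    return math.ceil(num_drives / num_enclosures)


# Example: 30 SATA drives with only 10 enclosures means 3 batches,
# while 30 enclosures would allow a single batch.
print(upload_batches(30, 10))  # prints 3
print(upload_batches(30, 30))  # prints 1
```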
Usage fees
- The customer is responsible for all shipping and insurance costs for the hard drives, both to Baidu's data center and back after the data upload. Once the upload is complete, BOS will return the hard drives with shipping charged to the recipient (freight collect).
- It is recommended that you insure the hard drives.
- Data center server usage, network bandwidth, and technical support services during the data upload are free of charge.
Usage process
Step 1: To use self-prepared hard drive delivery, first contact us by submitting a ticket or through your account manager. Provide your total data volume (e.g., 30 TB), the number of hard drives (e.g., 30 portable hard drives), whether hard drive enclosures are included (for SATA hard drives, please prepare enclosures in advance), the expected arrival time of the hard drives at Baidu’s data center, and the expected upload completion time.
Step 2: Copy the data to the hard drives, verify it, and back up any important data.
Step 3: Send the hard drives and enclosures (if included) to the following address:
| Item | Details |
|---|---|
| Address | Sinnet Internet Data Center, No. 37 Guangmao Road, Doudian Town, Fangshan District, Beijing |
| Telephone | 16619934602 |
| Contact person | BJDD computer room |
Step 4: Through your account manager or a ticket, tell us the name of the destination bucket, the expected directory structure, and your AK/SK. We will perform the data upload for you. It is recommended that you create the bucket and subdirectories in advance and create a new AK/SK for this purpose (do not use the default AK/SK).
Step 5: BOS technicians will receive and check the hard drives, then start the data upload if everything is in order. After the upload is complete, we will confirm the number of files with you via ticket or your account manager. After confirmation, the hard drives will be formatted, and the hard drives and enclosures (if included) will be returned to you.
Step 6: After the upload is complete, you can manage and use the data via the BOS console, API, SDK, etc. Delete the AK/SK you provided earlier.
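Because Step 5 ends with a file-count confirmation, it can be useful to record the file count and total size of each drive during the data check in Step 2. The following is a minimal sketch of such a check. The function name is hypothetical, and skipping symbolic links is an assumption about what should be counted.

```python
import os
from typing import Tuple


def summarize_tree(root: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for regular files under `root`.

    Symbolic links are skipped (an assumption; adjust if your data
    relies on links being followed).
    """
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlinks as files
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Running `summarize_tree` on the mount point of each drive before shipping gives a figure to compare against the file count reported after the upload.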
Migration of incremental data to the cloud
Incremental data refers to the new data your business continues to generate. It is typically produced in real time and changes dynamically.
Method 1: Online migration via the BOS CMD tool
The synchronous upload (sync) feature of BOS CMD provides an efficient way to handle incremental data during local-to-cloud migration. It supports batch operations by default and synchronizes a local directory with BOS. The sync command compares files on the local and BOS sides: if a file with the same name exists on BOS and has a newer modification time than the local file, it is skipped, so only new or updated files are uploaded. This keeps incremental uploads accurate and safe.
You can execute the following command on a machine with BOS CMD installed to use synchronous upload (sync):
$ bcecmd bos sync <local_dir> bos:/<bucket_name>/[prefix] [--exclude EXCLUDE] [--include INCLUDE] [--delete] [--exclude-delete EXCLUDE-DELETE] [--dryrun] [--yes] [--quiet] [--storage-class STORAGE-CLASS] [--sync-type SYNC-TYPE] [--concurrency CONCURRENCY] [--restart]
Where local_dir is the local synchronization directory. For specific operations and precautions, please refer to: BOS CMD Synchronous Upload (sync) Function
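The skip rule described for sync can be illustrated with a small sketch. This is not BOS CMD's implementation: the function names are hypothetical, options such as --delete and --sync-type are not modeled, and treating equal modification times as "unchanged" is an assumption.

```python
from typing import Dict, List, Optional


def should_upload(local_mtime: float, remote_mtime: Optional[float]) -> bool:
    """Decide whether a local file would be uploaded.

    - No object with the same name on the remote side: upload (new file).
    - Remote copy is newer: skip, as described above.
    - Equal timestamps: skip (assumption, treated as unchanged).
    """
    if remote_mtime is None:
        return True
    return local_mtime > remote_mtime


def plan_sync(local: Dict[str, float], remote: Dict[str, float]) -> List[str]:
    """Return the sorted names of local files that would be uploaded."""
    return sorted(name for name, mtime in local.items()
                  if should_upload(mtime, remote.get(name)))


# Example with made-up timestamps: a.txt is newer remotely (skipped),
# b.txt is newer locally (uploaded), c.txt is new (uploaded).
local_files = {"a.txt": 100.0, "b.txt": 200.0, "c.txt": 300.0}
remote_files = {"a.txt": 150.0, "b.txt": 100.0}
print(plan_sync(local_files, remote_files))  # prints ['b.txt', 'c.txt']
```

To see what the real tool would do without transferring anything, the --dryrun flag listed in the synopsis is the natural place to start.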
