          Use SQL Server Database as DTS Source

          1. Applicable Scenarios

          This article describes how to use the Baidu AI Cloud Data Transmission Service (hereinafter referred to as DTS) to run data migration tasks with a self-built SQL Server database as the source.

          2. Migration Preconditions

          2.1 Database account permission

          Database                 Structure migration and full data migration   Incremental migration
          Self-built SQL Server    At least db_datareader permission             At least db_owner permission
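
          As a minimal sketch of granting these roles (assuming a hypothetical migration account named dts_user and the placeholder database name DB_name; replace both with your own), the following can be run in the database to be migrated:

          USE [DB_name];
          -- Sufficient for structure migration and full data migration
          ALTER ROLE db_datareader ADD MEMBER dts_user;
          -- Required for incremental migration
          ALTER ROLE db_owner ADD MEMBER dts_user;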

          2.2 Prerequisites for incremental data migration task

          • The recovery model of the database to migrate is set to Full.
          • Incremental migration tasks use the Change Data Capture (CDC) function, whose availability depends on the SQL Server version and edition, as shown in the table below (a check query follows the table).
          Version   Enterprise   Standard
          12        Support      -
          14        Support      -
          16        Support      Support
          17        Support      Support
          19        Support      Support
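
          Both prerequisites can be verified with queries along the following lines (a sketch; DB_name is a placeholder for the database to be migrated):

          -- The recovery model of the database to migrate should be FULL
          SELECT name, recovery_model_desc FROM sys.databases WHERE name = 'DB_name';
          -- Check the edition and version of the instance against the table above
          SELECT SERVERPROPERTY('Edition') AS edition, SERVERPROPERTY('ProductVersion') AS product_version;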

          3. Restrictions on Using SQL Server Database as Source

          3.1 Restriction on structural migration task

          • For homogeneous migration (SQL Server→SQL Server), structure migration supports tables, views, user-defined data types, functions, stored procedures, and triggers.
          • For heterogeneous migration, structure migration only supports tables.

          3.2 Restriction on full data migration task

          • If a table to migrate has neither a primary key nor a unique constraint in the source database and its rows are not otherwise unique, duplicate data may appear in the target database (a query for locating such tables follows this list).
          • The MONEY and SMALLMONEY data types are migrated with only two decimal places.
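
          The following is a sketch for locating tables in the source database that have neither a primary key nor a unique constraint before starting a full migration:

          -- Tables without a primary key or unique constraint may produce duplicate rows in the target
          SELECT s.name AS schema_name, t.name AS table_name
          FROM sys.tables t
          JOIN sys.schemas s ON t.schema_id = s.schema_id
          WHERE NOT EXISTS (
              SELECT 1 FROM sys.indexes i
              WHERE i.object_id = t.object_id
                AND (i.is_primary_key = 1 OR i.is_unique_constraint = 1)
          );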

          3.3 Restriction on incremental data migration task

          • The SQL Server Agent service must be running before the CDC function can be enabled.
          • If a schema or a database user named cdc already exists in the database, change data capture cannot be enabled for that database until the schema or user is deleted or renamed.
          • DDL on a source table with CDC enabled can only be executed by members of the sysadmin server role, the db_owner database role, or the db_ddladmin database role.
          • The account used to enable the database-level CDC function must be a member of the sysadmin server role. The account used to enable the table-level CDC function must be a member of the sysadmin or db_owner role.
          • Incremental changes to column sets are not supported.
          • Incremental changes to computed columns are not supported (a query for locating column sets and computed columns follows this list).
          • The data types sql_variant, cursor, and table are not supported.
          • DELETE operations are not supported for incremental migration of tables without a primary key.
          • The MONEY and SMALLMONEY data types are migrated with only two decimal places.
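
          A sketch for locating the column sets and computed columns that would block incremental migration:

          -- Column sets and computed columns cannot be migrated incrementally
          SELECT s.name AS schema_name, t.name AS table_name, c.name AS column_name,
                 c.is_computed, c.is_column_set
          FROM sys.columns c
          JOIN sys.tables t ON c.object_id = t.object_id
          JOIN sys.schemas s ON t.schema_id = s.schema_id
          WHERE c.is_computed = 1 OR c.is_column_set = 1;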

          4. Using SQL Server Database as Source

          For the workflow of creating, configuring, pre-checking, starting, pausing, and terminating a task with a SQL Server database as the source, see the Best Practice documentation. Only the task configuration parameters and the object mapping steps differ slightly from other data sources.

          4.1 Task configuration parameters

          (Figure: configuration parameters of the SQL Server source)

          As shown in the figure above, DTS currently supports self-built SQL Server instances accessible over the public network as the source. The configuration parameters of the SQL Server source are as follows:

          • Access type: Self-built SQL Server instances on the public network/BCC/BBC/DCC are supported.
          • Data type: Fixed as SQL Server.
          • IP/port: Access IP address and service port of the self-built SQL Server database.
          • Database: Name of the database to migrate in the self-built SQL Server instance.
          • Account: Database account of the self-built SQL Server instance (a permission check query follows this list).
          • Password: Password of this database account.
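
          Before configuring the task, you can verify in the source database that the account entered above has the roles required in section 2.1. The following is a sketch, run while connected as that account:

          -- Returns 1 if the current login is a member of the role, 0 if not
          SELECT IS_ROLEMEMBER('db_datareader') AS is_datareader,
                 IS_ROLEMEMBER('db_owner') AS is_dbowner;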

          4.2 Object mapping (heterogeneous migration)

          SQL Server uses a three-level naming scheme (database.schema.table). When migrating to a destination with a two-level scheme (database.table), DTS provides two database/table name mapping modes to choose from. As shown in the figure below, the selected migration objects appear in the selected object list on the right. DTS supports mapping of upstream and downstream database/table names, column filtering, blacklist/whitelist filtering, and other functions; click "Edit" to configure the mapping and filtering rules for each migration object. The target database object can be mapped to either the database or the database and schema in the source database:

          1. Database objects are mapped to the source database: the schema in SQL Server is ignored. Tables in different SQL Server schemas are mapped into the specified MySQL database, and the target database name defaults to the source database name.
          2. Database objects are mapped to the source database and schema: the database name in SQL Server is ignored. Different SQL Server schemas are mapped to different MySQL databases, and the target database name defaults to the source schema name.

          After the object mapping is configured, click "Save and Pre-check" to start the pre-check of the task.

          5. Using CDC in SQL Server Database

          Before using the CDC function, you need to start the SQL Server Agent service. On RDS for SQL Server, starting the Agent service may fail; the solution is as follows:

          sp_configure 'show advanced options', 1;  -- allow advanced options to be changed
          reconfigure;
          sp_configure 'Agent XPs', 1;              -- enable the SQL Server Agent extended stored procedures
          reconfigure;
          • In the Agent properties, set the error log path to an existing path, for example C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Log\SQLAGENT.OUT.
          • In the properties of the local "SQL Server Agent" service, set the log-on identity to "Local System Account".
          • Start the local service, and then start SQL Server Agent in SSMS (a query for checking the Agent status follows this list).
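
          To confirm that the Agent service is running, a query like the one below can be used (a sketch; it requires the VIEW SERVER STATE permission):

          -- Check the status of the SQL Server Agent service
          SELECT servicename, startup_type_desc, status_desc
          FROM sys.dm_server_services
          WHERE servicename LIKE 'SQL Server Agent%';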

          5.1 Start the Database Level CDC Function

          -- Enable database level CDC
          EXEC sys.sp_cdc_enable_db;
          -- Check whether it succeeded (is_cdc_enabled should be 1)
          SELECT name, is_cdc_enabled FROM sys.databases WHERE name = 'DB_name';

          If you are using RDS, an error may occur:

          Could not update the metadata that indicates database DB_name is enabled for Change Data Capture. The failure occurred when executing the command 'SetCDCTracked(Value = 1)'. The error returned was 15404: 'Could not obtain information about Windows NT group/user 'RDS-WIN-TEST\Administrator', error code 0x534.'. Use the action and error to determine the cause of the failure and resubmit the request. 

          The solution is to execute one of the following commands and then run the database-level CDC command again:

          ALTER AUTHORIZATION ON DATABASE::[DB_name] TO [sa];
          -- or
          EXEC sp_changedbowner 'sa';
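
          As a quick sketch, you can confirm that the owner change took effect before re-running the database-level CDC command (DB_name is a placeholder):

          -- The owner_name should now be sa
          SELECT name, SUSER_SNAME(owner_sid) AS owner_name
          FROM sys.databases
          WHERE name = 'DB_name';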

          5.2 Enable the Table-level CDC Function for Each Table That Requires Incremental Migration

          -- Start table level CDC command
          EXEC sys.sp_cdc_enable_table
            @source_schema = N'schema_name',   -- name of the schema the source table belongs to; no default value, cannot be NULL
            @source_name = N'table_name',      -- name of the source table; no default value, cannot be NULL
            @role_name = NULL,                 -- database role used to gate access to the change data; the parameter must be specified. NULL (recommended here) means access to the change data is not restricted by a gating role; an existing fixed server role or database role can also be used, and if the specified role does not exist, a database role with that name is created automatically
            @capture_instance = DEFAULT,       -- name of the capture instance used to name the instance-specific change data capture objects; cannot be NULL, and a source table can have at most two capture instances
            @supports_net_changes = 0,         -- whether net-change querying is enabled for this capture instance; if set to 1, either index_name must be specified or the source table must have a primary key
            @index_name = NULL,                -- name of a unique index that identifies rows in the source table; can be NULL, in which case CDC uses the primary key (a primary key added later is ignored)
            @captured_column_list = NULL,      -- source table columns to include in the change table; NULL means all columns. For a comma-separated list, column names may be quoted with double quotation marks (") or square brackets ([]); the list cannot contain the reserved column names __$start_lsn, __$end_lsn, __$seqval, __$operation, and __$update_mask
            @filegroup_name = DEFAULT,         -- filegroup used for the change table created for this capture instance; NULL means the default filegroup. Creating a separate filegroup for change data capture change tables is recommended
            @allow_partition_switch = 1;       -- whether ALTER TABLE ... SWITCH PARTITION is allowed on the table with CDC enabled; the default value is 1. For non-partitioned tables this setting is always 1 and the supplied value is ignored

          If you are using RDS for SQL Server, an error may occur:

          Message 22832, Level 16, State 1, Procedure sp_cdc_enable_table_internal, line 623
          Unable to update metadata to indicate that change data capture is enabled for table [aqadmin].[AQ_TEST_1]. The failure occurred when executing the command '[sys].[sp_cdc_add_job] @job_type = N'capture''. The error returned was 22836: 'Unable to update the metadata of database aq11 to indicate that a change data capture job has been added. The failure occurred when executing the command 'sp_add_jobstep_internal'. The error returned was 14234: 'The specified @server is invalid (valid values are returned by sp_helpserver).'.'. Use the action and error to determine the cause of the failure and resubmit the request.

          This error occurs when the host name was changed after SQL Server was installed, so that SERVERPROPERTY('servername') and @@SERVERNAME return inconsistent results. The solution is to execute the following commands and then run the table-level CDC command again:

          IF SERVERPROPERTY('servername') <> @@SERVERNAME
            BEGIN
              DECLARE @server SYSNAME;
              SET @server = @@SERVERNAME;
              EXEC sp_dropserver @server = @server;
              SET @server = CAST(SERVERPROPERTY('servername') AS SYSNAME);
              EXEC sp_addserver @server = @server, @local = 'LOCAL';
              PRINT 'ok!';
            END
          ELSE
            PRINT 'undo!';

          The first time you enable table-level CDC, SQL Server prompts that two jobs are started: a capture job and a cleanup job.
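
          To check which tables have CDC enabled and to inspect the two jobs, queries along these lines can be used (a sketch):

          -- Tables with table-level CDC enabled in the current database
          SELECT s.name AS schema_name, t.name AS table_name
          FROM sys.tables t
          JOIN sys.schemas s ON t.schema_id = s.schema_id
          WHERE t.is_tracked_by_cdc = 1;

          -- Show the capture and cleanup jobs and their current settings
          EXEC sys.sp_cdc_help_jobs;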

          5.3 Modify the Execution Cycle of the Cleanup job (optional)

          By default, the captured incremental data is cleaned up after 4320 minutes (3 days). You can modify the retention time of the incremental data; the following command changes the retention time to 129600 minutes (90 days).

          EXECUTE sys.sp_cdc_change_job 
          @job_type = N'cleanup', 
          @retention = 129600; 
          
          -- Restart 
          EXECUTE sys.sp_cdc_start_job 
          @job_type = N'cleanup'; 

          5.4 Modify the Period for the Capture Job to Read the Transaction Log (optional)

          The default polling interval is 5 seconds. The longer the interval, the longer it takes for incremental changes to be written to the change table, so you can shorten the interval appropriately. The following command changes the polling interval to 1 second.

          EXECUTE sys.sp_cdc_change_job 
          @job_type = N'capture', 
          @pollinginterval = 1; 
          
          -- Restart 
          EXECUTE sys.sp_cdc_start_job 
          @job_type = N'capture'; 
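
          To confirm that the new retention and polling interval took effect, the job settings can be read back, for example (a sketch):

          -- retention is in minutes (cleanup job); pollinginterval is in seconds (capture job)
          SELECT job_type, retention, pollinginterval
          FROM msdb.dbo.cdc_jobs;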

          6. Source Data Type of SQL Server

          Data migration with SQL Server as the DTS source supports most SQL Server data types. The following table lists the supported SQL Server source data types and their default mapping to DTS data types.

          SQL Server data types DTS data type
          BIGINT DTS_TYPE_INT8
          INT DTS_TYPE_INT4
          SMALLINT DTS_TYPE_INT2
          TINYINT DTS_TYPE_INT2
          BIT DTS_TYPE_INT1
          DECIMAL (p,s) DTS_TYPE_NUMBER
          NUMERIC (p,s) DTS_TYPE_NUMBER
          MONEY DTS_TYPE_NUMBER
          SMALLMONEY DTS_TYPE_NUMBER
          REAL DTS_TYPE_FLOAT
          FLOAT(N) DTS_TYPE_DOUBLE
          DATETIME DTS_TYPE_DATETIME
          DATETIME2 DTS_TYPE_DATETIME
          SMALLDATETIME DTS_TYPE_DATETIME
          DATE DTS_TYPE_DATE
          TIME DTS_TYPE_TIME
          DATETIMEOFFSET DTS_TYPE_STRING
          CHAR(N) DTS_TYPE_STRING
          VARCHAR(N) DTS_TYPE_STRING
          VARCHAR(max) DTS_TYPE_STRING
          TEXT DTS_TYPE_STRING
          NCHAR(N) DTS_TYPE_STRING
          NVARCHAR(N) DTS_TYPE_STRING
          NVARCHAR(max) DTS_TYPE_STRING
          NTEXT DTS_TYPE_STRING
          BINARY(N) DTS_TYPE_BYTES
          IMAGE DTS_TYPE_BYTES
          VARBINARY(N) DTS_TYPE_BYTES
          VARBINARY(max) DTS_TYPE_BYTES
          TIMESTAMP DTS_TYPE_BYTES
          UNIQUEIDENTIFIER DTS_TYPE_STRING
          HIERARCHYID DTS_TYPE_STRING
          XML DTS_TYPE_STRING
          GEOMETRY DTS_TYPE_STRING
          GEOGRAPHY DTS_TYPE_STRING
          ROWVERSION DTS_TYPE_BYTES

          Tables containing fields of the following data types are not supported (a query for locating such tables follows the list):

          • CURSOR
          • SQL_VARIANT
          • TABLE
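
          Of these three types, cursor and table cannot be column types of a persisted table, so a pre-migration check only needs to look for SQL_VARIANT columns; a sketch:

          -- Tables containing SQL_VARIANT columns
          SELECT s.name AS schema_name, t.name AS table_name, c.name AS column_name
          FROM sys.columns c
          JOIN sys.tables t ON c.object_id = t.object_id
          JOIN sys.schemas s ON t.schema_id = s.schema_id
          JOIN sys.types ty ON c.user_type_id = ty.user_type_id
          WHERE ty.name = 'sql_variant';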