Ajinkya Pujari
Ajinkya Pujari

Reputation: 71

i have enabled CDC on few non-empty tables , are the CDC system tables under cdc. schema remains empty till i do DDL/DML on the base user tables?

i am using ssis job to fetch CDC data from source to destination based on lsn value from the system-created cdc tables(with suffix _CT).

when i enabled the CDC , the base tables already had many rows.

Now , in the first ssis run i need to pull all existing data based on lsn before i start doing DML operations, but all cdc tables are empty.

so the question is if we enable cdc on a non empty table , wont it have the cdc data in the cdc system tables before we start changing the data ?

Upvotes: 0

Views: 981

Answers (1)

Ben Thul
Ben Thul

Reputation: 32707

The short answer is: no - the CDC table won't have a copy of the initial data. The CDC tables are populated by reading changes (inserts, updates, deletes) from the transaction log and inserting them into the CDC tables.

I would use either a data snapshot or backup to initialize your data. Specifically:

  1. Enable CDC on your table(s)
  2. Create a database backup and restore it somewhere or take a database snapshot
    • If using a database backup, take note of the checkpoint LSN; you can find this in msdb.dbo.backupset
    • If using a database snapshot, take note of the LSN of the snapshot; you can find this using DBCC DBINFO(DATABASENAME) WITH TABLERESULTS; (all credit to this link)
  3. Do a bulk load from the database that you created in the previous step to your eventual destination.
  4. For the next (incremental) load from the CDC data, you can use the LSN from the previous step to pass to the cdc.fn_cdc_get_all_changes_<capture_instance>() or cdc.fn_cdc_get_net_changes_<capture_instance>() functions. This will give you changes that happened from the point-in-time represented by the backup or snapshot that you used to do the initial load.

Upvotes: 1

Related Questions