Reputation: 133
I thought I had a simple question, but I am having a hard time finding an answer. This makes me suspicious that I'm asking the wrong question...
I am new to Redshift... I am using a DMS migration task to pull data from a Sql Server database sitting on an EC2 instance into my Redshift database. I have it set to do a Full Load with Ongoing Replication. This is working.
However, I want to know specifically which rows have changed after ongoing replication makes its updates. It is replicating to my staging tables, and I do further transformations from there depending on certain changes to columns (eg history tracking), which is why I need to know what changed. I compare the staging tables to the existing facts and dimensions, but I don't want to compare the entire table, just the modified rows.
The source database is older and I can't trust the modification timestamp columns are always updated. I thought that setting the migration task to truncate the table, then ingesting ongoing changes would leave my staging table with just changed rows. In hindsight, maybe that was a silly thought.
The other route I am thinking is to turn on CDC in the source tables, load staging tables on the Sql Server side with the net changes and put DMS on those tables instead. I was hoping that extra step would not be necessary.
Help is appreciated!
Upvotes: 0
Views: 732
Reputation: 26
Not sure if you're still looking for a answer on this but you could always use a transformation rule to flag rows that have changed from dms. e.g.
{
"rule-type": "transformation",
"rule-id": "1",
"rule-name": "1",
"rule-target": "column",
"object-locator": {
"schema-name": "%",
"table-name": "%"
},
"rule-action": "add-column",
"value": "dms_load_ts",
"expression": "current_timestamp",
"data-type": {
"type": "datetime"
}
}
Upvotes: 1