Reputation: 1
I am working with Azure Data Factory and have set up a data flow to combine datasets from two different sources: a SQL Database (SQLDBCOVID19Metrics_DL) and a SQL Server (SQLServerCOVID19Metrics_DL). Both sources contain data for 5 different countries. The intent is to use a Union activity to merge these datasets into one consolidated dataset, which then gets output to a sink (ODS_Cases_Sink).
In the ADF pipeline, I have:
- Source activities that import data from both the SQL DB and SQL Server.
- A Union activity intended to combine rows from both sources.
- A Select activity that renames columns in preparation for the sink.
- The sink activity that should receive the unified dataset and export it.

However, after the pipeline runs, the sink only contains the data from the SQL DB, with the SQL Server's data missing entirely.
I expected the Union activity to combine all rows from both sources, but it seems to only process the rows from the SQL DB. Here's what I've tried:
- I've checked that both source activities are correctly configured and are able to fetch data independently.
- I confirmed that the Union activity's settings appear to include all columns from both datasets.
- I've reviewed the ADF monitoring logs, which show no errors or warnings.
- I've ensured that there's no row limitation in either the Union or Select activities that could inadvertently filter out the SQL Server data.

I was expecting to see a combined dataset with records from both the SQL DB and SQL Server in the sink but ended up with only half of the expected data.
Could it be a schema mismatch issue, or is there something I'm missing in the Union activity configuration? How can I debug the Union activity in Azure Data Factory to find out why the SQL Server data is not included in the output?
Any assistance or suggestions would be greatly appreciated.
Here is what the data flow looks like:
Upvotes: 0
Views: 185
Reputation: 563
This looks like a schema mismatch issue. Check that both datasets have the same schema, and make sure there is no spelling or case difference in any of the column names.
Open the Inspect tab on each source and compare the schemas. Try re-importing the projection for both datasets and confirm the schemas match. If they don't, use a Select transformation before the Union to fix the column order or column names that differ.
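If comparing the two Inspect tabs by eye is error-prone, the check can be sketched outside ADF. This is only a rough illustration in Python, not anything ADF provides: `compare_schemas` is a hypothetical helper, and the column names are made up since the actual schemas were not shared. Paste in the column names as they appear in each source's projection.

```python
def compare_schemas(left, right):
    """Compare two lists of column names exactly (case- and whitespace-sensitive)."""
    left_set, right_set = set(left), set(right)

    # Columns that exist on one side only under an exact comparison.
    only_left = sorted(left_set - right_set)
    only_right = sorted(right_set - left_set)

    # Near matches: names that become equal once case and surrounding
    # whitespace are ignored. These are the usual spelling/case suspects.
    right_by_key = {name.strip().lower(): name for name in only_right}
    near_matches = [
        (name, right_by_key[name.strip().lower()])
        for name in only_left
        if name.strip().lower() in right_by_key
    ]

    return {
        "only_left": only_left,
        "only_right": only_right,
        "near_matches": near_matches,
        "same_order": list(left) == list(right),
    }


# Hypothetical column names, for illustration only.
sql_db_cols = ["Country", "ReportDate", "Cases"]
sql_server_cols = ["country", "ReportDate", "Cases "]

report = compare_schemas(sql_db_cols, sql_server_cols)
for key, value in report.items():
    print(f"{key}: {value!r}")
```

Any pair listed under `near_matches` is a candidate for renaming in a Select transformation before the Union. Whether a given difference actually affects the result depends on how the Union is configured (by name or by position), so treat the report as a list of things to look at rather than a diagnosis.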
It would also help if you could share a screenshot of the Inspect tab for both source datasets.
Upvotes: 0