How to make ADF copy data activity convert "" into null?

I am running into a problem trying to copy data from a file in Blob storage to my SQL database.

My file in Blob is a .CSV, and it uses double quotes (") as text qualifiers.

One of the file columns is a date column and it is very sparse. Where it does not contain data, the value is just "".

My target column in the SQL database table is [datetime null].
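
For reference, a minimal version of the target table as described would be something like this (dbo.TargetTable and the column names are just placeholders, not my real schema):

CREATE TABLE dbo.TargetTable (
    Id INT,
    MyDate DATETIME NULL
);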

My pipeline has a single copy activity; the source dataset is the CSV in Blob storage and the sink points to the SQL database.

In the source configuration I am using (") as the quote character, so the file is read correctly.

But when I debug, the copy fails. The error message says it was not possible to convert "" (empty string) to datetime (the target column type).

But if I remove the qualifiers (") from the source file and try again, the copy now succeeds and the empty values are converted to null.

What can I do? Removing the qualifiers is not an option.

I could change the target column's data type to varchar and do a cleanup in a second step, but that is not desirable either.
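
Just for context, that second step would look roughly like this (MyDateRaw is a hypothetical varchar staging column, MyDate the real datetime column):

UPDATE dbo.TargetTable
SET MyDate = TRY_CONVERT(datetime, NULLIF(MyDateRaw, ''));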

My goal is to do this copy while keeping the qualifiers and keeping the target column as the datetime data type.

Thanks in advance

Upvotes: 0

Views: 1981

Answers (1)

Saideep Arikontham

Reputation: 6114

  • You can directly choose which value in the data should be treated as null in the dataset configuration. The following is a demonstration of the same.

  • I have a csv file with the following data:

"id","gname","dt"
"1","Ceb","2023-04-15"
"2","Ana",""
"3","","2023-04-01"
  • When I try to insert this data into the table with the copy activity, I get the same error.

  • Using @string('') in the Null value field of the copy data source dataset configuration also gave the same error, as the default value for this field is the empty string itself.

  • Using a lookup activity, it can be seen that the empty string is already treated as null (not sure why the copy activity does not recognize it). One alternative is to use Azure Data Factory data flows directly to copy the data.

  • Without any additional changes, using just a source and a sink in the data flow, I was able to insert the data without any errors.

  • Another way is to use OPENJSON on the lookup activity output in a script activity. But if your source has more than 5000 rows, this does not work (that is the lookup activity's row limit). The following is the query used in the script activity after the lookup:

INSERT INTO t1
SELECT * FROM OPENJSON('@{activity('Lookup1').output.value}')
WITH (id VARCHAR(100), gname VARCHAR(100), dt DATETIME)
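
  • As a standalone sketch of the same pattern, the script below uses a hypothetical JSON literal standing in for the Lookup1 output (assuming the empty date surfaces as null, as seen in the lookup output above):

DECLARE @rows NVARCHAR(MAX) = N'[
  {"id":"1","gname":"Ceb","dt":"2023-04-15"},
  {"id":"2","gname":"Ana","dt":null},
  {"id":"3","gname":"","dt":"2023-04-01"}
]';

INSERT INTO t1
SELECT *
FROM OPENJSON(@rows)
WITH (id VARCHAR(100), gname VARCHAR(100), dt DATETIME);
-- the row with "dt": null is inserted as NULL in the datetime column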

Upvotes: 0
