Deffo

Reputation: 197

Azure Data Factory dynamic Excel data flow

My scenario is the following: I created a pipeline that works on an Excel file and is triggered by the upload of the Excel file to Blob Storage. The file name is passed as a parameter to the pipeline by the trigger. I then apply a Lookup and then a ForEach where I copy each row of the Excel file to different SQL tables. Now I have two problems:

  1. Sometimes some rows are deleted manually from the Excel file before uploading, but even though they are empty the Lookup activity reads them as rows full of nulls, while I would like to skip them. I tried to filter them out with a data flow, but I do not understand how to configure it, because nothing appears in the dropdown menus and I don't know what to write as dynamic content. For example, I tried to use a Select to rename the columns ([screenshot: select columns]), but then in the Filter activity nothing appears in the Input schema section, so I don't know how to make it work ([screenshot: filter details]).

  2. In the last part of my pipeline, when I copy data to the SQL tables, I do it with a JSON mapping:

    @json(' { "type": "TabularTranslator", "mappings": [ { "source": { "name": "sampleId" }, "sink": { "name": "sample_id", "physicalType": "int" } }, { "source": { "name": "parameterId" }, "sink": { "name": "parameter_id", "physicalType": "int" } }, { "source": { "name": "resultText" }, "sink": { "name": "result_text", "physicalType": "varchar", "nullValue": "null" } }, { "source": { "name": "resultSymbol" }, "sink": { "name": "result_symbol", "physicalType": "varchar", "nullValue": "null" } }, { "source": { "name": "resultNumber" }, "sink": { "name": "result_number", "physicalType": "decimal", "nullValue": "null" } } ], "typeConversion": true, "typeConversionSettings": { "allowDataTruncation": true, "treatBooleanAsNumber": false } } ')

    But even though the result_number column is nullable and the input contains a null, I get an error: ErrorCode=TypeConversionFailure,Exception occurred when converting value '' for column name 'result_number' from type 'String' (precision:, scale:) to type 'Decimal' (precision:10, scale:2). Additional info: The input wasn't in a correct format.

    This is what I see in the input tooltip:

    [screenshot: input tooltip]

    How can I make it insert a NULL in the db instead of failing?

Thank you

EDIT1

Some more details on my pipeline. This is where I think I should put the data flow ([screenshot: complete pipeline]), and here are the details about the dataset:

[screenshot: dynamic excel lookup]

[screenshot: dynamic excel lookup parameters]

Upvotes: 0

Views: 100

Answers (1)

Pratik Lad

Reputation: 8291

Sometimes some rows are deleted manually from the Excel file before uploading, but even though they are empty the Lookup activity reads them as rows full of nulls, while I would like to skip them.

To skip the rows that are full of nulls, you can use a Filter activity as below:

  • Take a Filter activity and pass the Lookup output to it as its items, as below. [screenshot]
  • Then add a condition as below, where we compare each field to null; if all fields are null, the row gets skipped:
@or(or(or(or(not(equals(item().sampleId,null)),not(equals(item().parameterId,null))),not(equals(item().resultText,null))),not(equals(item().resultSymbol,null))),not(equals(item().resultNumber,null)))

[screenshot]
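
If the nested or()/not() calls become hard to read, an equivalent condition can be sketched with coalesce(), which returns the first non-null value (or null when every field is null). This is only a sketch and assumes the Lookup really returns nulls, not empty strings, for the deleted rows:

@not(equals(coalesce(item().sampleId, item().parameterId, item().resultText, item().resultSymbol, item().resultNumber), null))

If the empty cells arrive as '' instead of null, replace the null comparison with an empty-string check on each field.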

ErrorCode=TypeConversionFailure,Exception occurred when converting value '' for column name 'result_number' from type 'String' (precision:, scale:) to type 'Decimal' (precision:10, scale:2). Additional info: The input wasn't in a correct format.

In the Copy activity, the value of an additional column can only be a string; it does not accept a null value, which is the cause of the error you are getting.

You can use a Data Flow instead of the Copy activity to resolve the issue:

  • First create a dummy file with your data in a single row and add it as the source in the data flow. [screenshot] My source data preview: [screenshot]
  • Create parameters in the data flow with a default value of null() matching the respective data types, as below: [screenshot]
  • Then pass the values from the ForEach to these parameters in the Data Flow activity, as below: [screenshot]
  • Then, in the data flow, take a Derived Column transformation and create columns with expressions using the above parameters, as in the image and expressions below (see also the sketch after these steps):
resultText1 - iif(equals($resultTextparam,''), toString(null()), $resultTextparam)
resultSymbol1 - iif(equals($resultSymbolparam,''), toString(null()), $resultSymbolparam)

[screenshot]

  • Then take a Select transformation, select only the columns you created in the Derived Column transformation, and rename them to the original column names: [screenshot]
  • In the sink, map the columns properly and copy the data: [screenshot]
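
For the result_number column that originally caused the TypeConversionFailure, the same pattern can be extended with a typed conversion. This is only a sketch: resultNumberparam is an assumed string parameter named like the others, and iif(), toDecimal() and null() are standard mapping data flow expression functions:

resultNumber1 - iif(equals($resultNumberparam,''), toDecimal(null()), toDecimal($resultNumberparam, 10, 2))

Rename resultNumber1 to result_number in the Select transformation along with the other columns, so the sink mapping stays the same.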

Final output:

[screenshot: final output]

Upvotes: 0
