mkn

Reputation: 27

AzureDataFactory-DataFlow-Sink DeltaTable to AzureSQLDatabase with DeltaLake time travel via version

The goal is to read a delta_log file in an ADF pipeline with a 'Lookup' activity, take the version from its output, and store it with a 'Set variable' activity. I then want to pass the 'Set variable' output as a parameter to the DataFlow Source, and finally sink the data found for that version into a specific table in AzureSQLDatabase. See the process below:

enter image description here

Lookup - source dataset is JSON - delta_log file

Set variable - the pipeline variable deltatableVersion (type Integer) is set here; the content of the expression builder is: @int(activity('Lookup1').output.value[0]['version'])

this is the result from 'Set variable'
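As a cross-check on the value the Lookup returns, the version can also be derived from the names of the commit files in the _delta_log folder, since Delta names each commit file with its zero-padded 20-digit version number. The following is only a minimal sketch of that idea in Python, not something ADF runs; the function name `latest_delta_version` and the sample listing are hypothetical.

```python
import re

# Delta commit files are named <20-digit zero-padded version>.json,
# e.g. 00000000000000000003.json. Checkpoints and _last_checkpoint do not match.
_COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def latest_delta_version(file_names):
    """Return the highest commit version in a _delta_log listing, or None if there is none."""
    versions = []
    for name in file_names:
        base = name.rsplit("/", 1)[-1]  # drop any folder prefix
        match = _COMMIT_FILE.match(base)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) if versions else None


# Hypothetical listing of a _delta_log folder, for illustration only.
sample_listing = [
    "_delta_log/00000000000000000000.json",
    "_delta_log/00000000000000000001.json",
    "_delta_log/00000000000000000002.json",
    "_delta_log/00000000000000000002.checkpoint.parquet",
    "_delta_log/_last_checkpoint",
]

print(latest_delta_version(sample_listing))
```

If the number printed for the real folder listing differs from what 'Set variable' shows, the Lookup expression is probably not picking up the field I expect.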

DataFlow activity configuration:

enter image description here enter image description here

The DataFlow has a Source and a Sink, plus a parameter deltatableVersion of type Integer.

enter image description here

In the Source, the inline dataset type is Delta and the linked service is an ADLS Gen2 connection to a folder that contains many parquet files (between 100 and 200). enter image description here

And in the Source Options this is what we have: enter image description here

The Sink type is 'Dataset' - a table in AzureSQLDatabase with the following configuration:

enter image description here

enter image description here

And I'm getting these errors:

enter image description here

And this one: enter image description here

Upvotes: 0

Views: 93

Answers (1)

Pratik Lad

Reputation: 8291

Sink DeltaTable to AzureSQLDatabase with DeltaLake time travel via version

I tried the same and it works fine for me. Here are the steps I followed:

  • First, in the data flow I created a parameter named version_no of type Integer. enter image description here
  • Then I passed this parameter in Source settings >> version, as below. enter image description here
  • In the data flow activity's parameters, I passed the value of the pipeline variable. enter image description here
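Since data flows execute on Spark, the version setting in the Source roughly corresponds to a Delta time-travel read. The sketch below shows what that looks like in PySpark, assuming a Spark session that already has Delta Lake configured; the helper names `time_travel_options` and `read_delta_version` are hypothetical, and this is an illustration rather than what ADF generates internally.

```python
def time_travel_options(version):
    """Build Delta reader options for reading a table as of a given version."""
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(version, bool) or not isinstance(version, int) or version < 0:
        raise ValueError("version must be a non-negative integer")
    return {"versionAsOf": str(version)}


def read_delta_version(spark, table_path, version):
    """Read the Delta table at table_path as of the given version.

    `spark` is assumed to be an existing SparkSession with Delta Lake enabled.
    """
    return (
        spark.read.format("delta")
        .options(**time_travel_options(version))
        .load(table_path)
    )


print(time_travel_options(3))
```

Running the same read outside ADF with the version number from the pipeline variable is one way to confirm that the requested version still exists in the table history.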

The error you are facing might be caused by the cluster running out of disk space. Please retry using an integration runtime with a bigger core count and/or a memory-optimized compute type. enter image description here

If the above does not help, please raise a support ticket.

Upvotes: 0
