Ivan Nel

Reputation: 325

ADF Default Columns

ADF Copy task:

Importing flat files with the wildcard *.txt; some files have 18 columns, some have 24. The SQL table sink has 24 columns.

The copy fails because it does not find a mapping for columns 19-24. Can I default the mapping of the last 6 columns to NULL when no value is found?

EDIT:

I copied my source to blob storage and used a data flow with schema drift instead. I can connect to my source and can see that it writes Parquet files to the staging folder, but after calculating the rows the data flow fails with this error:

Operation on target Dataflow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'nsodevsynapse': Unable to stage data before write. Check configuration/credentials of storage","Details":"org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: This operation is not permitted on a non-empty directory.\n\tat org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2607)\n\tat org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2617)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.deleteFile(NativeAzureFileSystem.java:2657)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem$2.execute(NativeAzureFileSystem.java:2391)\n\tat org.apache.hadoop.fs.azure.AzureFileSystemThreadPoolExecutor.executeParallel(AzureFileSystemThreadPoolExecutor.java:223)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.deleteWithoutAuth(NativeAzureFileSystem.java:2403)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.delete(NativeAzureFileSystem.java:2453)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.delete(NativeAzureFileSystem.java:1936)\n\tat org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter."}

My sink is using a SQL account to connect; I can connect to SQL using that account, and I can write to and edit SQL tables with it.

The managed instance has owner permissions on the storage account.

Upvotes: 0

Views: 667

Answers (1)

NiharikaMoola

Reputation: 5074

You can load the data using the Data Flow activity by enabling "Allow schema drift" on the source and sink transformations; columns that are missing from a source file are automatically defaulted to NULL.
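For reference, a minimal sketch of what the data flow script (the Script view of the data flow canvas) could look like with schema drift enabled on both transformations; the transformation names and the wildcard path are illustrative, not taken from your pipeline:

source(allowSchemaDrift: true,
    validateSchema: false,
    ignoreNoFilesFound: false,
    wildcardPaths:['*.txt']) ~> TxtFiles
TxtFiles sink(allowSchemaDrift: true,
    validateSchema: false,
    format: 'table',
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true) ~> SqlSink

With schema drift on and auto mapping in the sink, files that only carry 18 of the 24 columns load with the remaining columns set to NULL, provided those columns are nullable in the SQL table.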

Source files:


Dataflow:

• In the source and sink transformations, enable "Allow schema drift" if the source schema changes often.



• In the sink, add a mapping to map all source columns to the destination columns (a rough sketch of the resulting script follows below).
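If you use a fixed mapping instead of auto mapping, it shows up in the data flow script as a mapColumn list inside the sink, roughly like this sketch (the column names are hypothetical placeholders; you would list all 24 destination columns):

TxtFiles sink(allowSchemaDrift: true,
    validateSchema: false,
    format: 'table',
    mapColumn(
        Column_1,
        Column_2,
        Column_24
    ),
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true) ~> SqlSink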


Destination SQL table:


Upvotes: 1
