Reputation: 325
ADF Copy task:
Importing flat files with the wildcard *.txt; some files have 18 columns, some have 24. The SQL table sink has 24 columns.
It fails because it does not find a mapping for columns 19-24. Can I default the mapping of the last 6 columns to NULL when no value is found?
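For context, a minimal sketch of how such a Copy activity source might be configured, assuming a Blob storage delimited-text dataset and an Azure SQL sink; all names here are illustrative, not taken from the actual pipeline:

{
  "name": "CopyTxtToSql",
  "type": "Copy",
  "inputs": [ { "referenceName": "SourceTxtDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "SqlTableDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": {
      "type": "DelimitedTextSource",
      "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "recursive": true,
        "wildcardFileName": "*.txt"
      },
      "formatSettings": { "type": "DelimitedTextReadSettings" }
    },
    "sink": { "type": "AzureSqlSink" }
  }
}

With a setup like this, the 18-column files have no source field for columns 19-24 of the sink table, which is what produces the mapping failure described above.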
EDIT:
I copied my source to Blob storage and used a data flow with schema drift instead. I can connect to my source and can see that it writes Parquet files to the staging folder, but after calculating the rows the workflow fails with this error:
Operation on target Dataflow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'nsodevsynapse': Unable to stage data before write. Check configuration/credentials of storage","Details":"org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: This operation is not permitted on a non-empty directory.\n\tat org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2607)\n\tat org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.delete(AzureNativeFileSystemStore.java:2617)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.deleteFile(NativeAzureFileSystem.java:2657)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem$2.execute(NativeAzureFileSystem.java:2391)\n\tat org.apache.hadoop.fs.azure.AzureFileSystemThreadPoolExecutor.executeParallel(AzureFileSystemThreadPoolExecutor.java:223)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.deleteWithoutAuth(NativeAzureFileSystem.java:2403)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.delete(NativeAzureFileSystem.java:2453)\n\tat org.apache.hadoop.fs.azure.NativeAzureFileSystem.delete(NativeAzureFileSystem.java:1936)\n\tat org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter."}
My sink uses a SQL account to connect; I can connect to SQL with that account and can edit SQL tables with it.
The managed instance has Owner permissions on the storage account.
Upvotes: 0
Views: 667
Reputation: 5074
You can load the data using a Data Flow activity by enabling "Allow schema drift" in the source and sink transformations; columns that are missing from a source file will automatically default to NULL.
Source files:
Dataflow:
• In the source and sink transformations, enable "Allow schema drift" if the source schema changes often.
• Add a mapping to map all source columns to the destination columns (a sketch of the resulting data flow definition follows below).
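A minimal, simplified sketch of what the data flow definition could look like with schema drift enabled on both ends; the dataset references and transformation names are placeholders, and the designer would normally generate additional sink options:

{
  "name": "Dataflow1",
  "properties": {
    "type": "MappingDataFlow",
    "typeProperties": {
      "sources": [
        { "name": "TxtSource", "dataset": { "referenceName": "SourceTxtDataset", "type": "DatasetReference" } }
      ],
      "sinks": [
        { "name": "SqlSink", "dataset": { "referenceName": "SqlTableDataset", "type": "DatasetReference" } }
      ],
      "transformations": [],
      "scriptLines": [
        "source(allowSchemaDrift: true,",
        "     validateSchema: false) ~> TxtSource",
        "TxtSource sink(allowSchemaDrift: true,",
        "     validateSchema: false,",
        "     skipDuplicateMapInputs: true,",
        "     skipDuplicateMapOutputs: true) ~> SqlSink"
      ]
    }
  }
}

With allowSchemaDrift set to true on the sink and auto-mapping in place, drifted columns flow through by name, and any destination column that has no matching source column is left NULL on insert.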
Destination SQL table:
Upvotes: 1