Reputation: 27
I’m creating a data flow with a derived column that applies a rounding transformation, through the .NET SDK. The source and sink datasets I am using are parameterized, and I assign values to them at runtime through pipeline parameters. Please refer to the JSON definitions of the two data flows below.
My scenario is rounding a salary column from three decimal places to two. When I create this manually in ADF, it rounds successfully.
But when I create it using the .NET SDK, it doesn’t work: the new column’s name (header) is not appearing in the output as expected, although the values come through correctly.
Below is the JSON of the data flow I created through the SDK:
{
    "name": "Rounding_Auto__Transformation",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "defaultdataflowSourcedataset",
                        "type": "DatasetReference"
                    },
                    "name": "source"
                }
            ],
            "sinks": [
                {
                    "dataset": {
                        "referenceName": "defaultdataflowSinkdataset",
                        "type": "DatasetReference"
                    },
                    "name": "sink"
                }
            ],
            "transformations": [
                {
                    "name": "DerivedColumn0"
                }
            ],
            "script": "source(output(\n\t\tid as string,\n\t\tsal as string,\n\t\tgender as string,\n\t\tname as string,\n\t\tisMarried as string,\n\t\ttags as string,\n\t\taddress as string\n\t),\n\tallowSchemaDrift: true,\n\tvalidateSchema: false,\n\tignoreNoFilesFound: false) ~> source\nsource derive(NewSal = round(toFloat(sal),2,2)) ~> DerivedColumn0\nDerivedColumn0 sink(allowSchemaDrift: true,\n\tvalidateSchema: false,\n\tpartitionFileNames:['customer_post_with_round.csv'],\n\tpartitionBy('hash', 1),\n\tskipDuplicateMapInputs: true,\n\tskipDuplicateMapOutputs: true) ~> sink"
        }
    }
}
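For context, the SDK side looks roughly like this. This is a simplified sketch using Microsoft.Azure.Management.DataFactory; client authentication is omitted, and the resource group and factory names are placeholders.

using System.Collections.Generic;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;

// 'client' is assumed to be an already-authenticated DataFactoryManagementClient.
string script =
    "source(output(\n\t\tid as string,\n\t\tsal as string,\n\t\tgender as string,\n\t\tname as string,\n\t\tisMarried as string,\n\t\ttags as string,\n\t\taddress as string\n\t),\n\tallowSchemaDrift: true,\n\tvalidateSchema: false,\n\tignoreNoFilesFound: false) ~> source\n" +
    "source derive(NewSal = round(toFloat(sal),2,2)) ~> DerivedColumn0\n" +
    "DerivedColumn0 sink(allowSchemaDrift: true,\n\tvalidateSchema: false,\n\tpartitionFileNames:['customer_post_with_round.csv'],\n\tpartitionBy('hash', 1),\n\tskipDuplicateMapInputs: true,\n\tskipDuplicateMapOutputs: true) ~> sink";

var dataFlow = new MappingDataFlow
{
    Sources = new List<DataFlowSource>
    {
        new DataFlowSource("source", dataset: new DatasetReference("defaultdataflowSourcedataset"))
    },
    Sinks = new List<DataFlowSink>
    {
        new DataFlowSink("sink", dataset: new DatasetReference("defaultdataflowSinkdataset"))
    },
    Transformations = new List<Transformation> { new Transformation("DerivedColumn0") },
    Script = script
};

// Publish the data flow; this produces the JSON shown above.
client.DataFlows.CreateOrUpdate("myResourceGroup", "myDataFactory",
    "Rounding_Auto__Transformation", new DataFlowResource(dataFlow));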
I also compared it with the JSON created when the data flow is built manually in ADF (which works). Here is the manual one:
{
    "name": "Rounding_Manually",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "SourcDS",
                        "type": "DatasetReference"
                    },
                    "name": "source1"
                }
            ],
            "sinks": [
                {
                    "dataset": {
                        "referenceName": "SinkDS",
                        "type": "DatasetReference"
                    },
                    "name": "sink1"
                }
            ],
            "transformations": [
                {
                    "name": "DerivedColumn1"
                }
            ],
            "script": "source(output(\n\t\tid as string,\n\t\tsal as string,\n\t\tgender as string,\n\t\tname as string,\n\t\tisMarried as string,\n\t\ttags as string,\n\t\taddress as string\n\t),\n\tallowSchemaDrift: true,\n\tvalidateSchema: false,\n\tignoreNoFilesFound: false) ~> source1\nsource1 derive(NewSal = round(toFloat(sal),2,2)) ~> DerivedColumn1\nDerivedColumn1 sink(allowSchemaDrift: true,\n\tvalidateSchema: false,\n\tpartitionFileNames:['customer_post_with_round.csv'],\n\tpartitionBy('hash', 1),\n\tskipDuplicateMapInputs: true,\n\tskipDuplicateMapOutputs: true) ~> sink1"
        }
    }
}
Apart from the dataset references and the stream/transformation names, the two definitions look identical to me. Please help.
Upvotes: 0
Views: 1184
Reputation: 11
In this case the headers in the sink files were missing, so the new column produced by the transformation did not appear with its name. Enabling the First row as header option on the sink dataset solved the issue.
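If the sink dataset is created through the .NET SDK as well, this corresponds to setting the FirstRowAsHeader property on the DelimitedText dataset. A minimal sketch follows; the linked service, container, and folder names are placeholders.

using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;

// 'client' is assumed to be an already-authenticated DataFactoryManagementClient.
var sinkDataset = new DelimitedTextDataset
{
    LinkedServiceName = new LinkedServiceReference("AzureBlobStorageLS"),
    Location = new AzureBlobStorageLocation
    {
        Container = "output",
        FolderPath = "rounded"
    },
    ColumnDelimiter = ",",
    FirstRowAsHeader = true   // the "First row as header" option in the UI
};

client.Datasets.CreateOrUpdate("myResourceGroup", "myDataFactory",
    "defaultdataflowSinkdataset", new DatasetResource(sinkDataset));

With this set, the derived column’s header (NewSal) is written to the output file as expected.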
Upvotes: 0
Reputation: 3838
I imported your data flow definition into my environment, and I do see the column name in the Inspect metadata and in the mapping list. Can you do the same: copy/paste the data flow script into the UI and confirm that everything looks fine there?
Upvotes: 0