kevzettler
kevzettler

Reputation: 5213

How to modify dynamic complex data type fields in azure data factory data flow

I have a complex data type (fraudData) that undesirably has hyphen characters in the field names I need to remove or change the hypens to some other character.

The input schema of the complex object looks like: enter image description here

I have tried using the "Select" and "Derive Column" data flow functions and adding a custom mapping. It seems both functions have the same mapping interface. My current attempt with Select is: enter image description here

This gets me close to the desired results. I can use the replace expression to convert hypens to underscores.

The problem here is that this mapping creates new root level columns outside of the fraudData structure. I would like to preserve the hierarchy of the fraudData structure and modify the column names in place.

If I am unable to modify the fraudData in place. Is there any way I can take the new columns and merge them into another complex data type?

Update:. I do not know the fields of the complex data type in advance. This is a schema drift problem. This is why I have tried using the pattern matching solution. I will not be able to hardcode out kown sub-column names.

Upvotes: 1

Views: 1696

Answers (1)

Aswin
Aswin

Reputation: 7156

You can rename the sub-columns of complex data type using derived column transformation and convert them as a complex data type again. I tried this with sample data and below is the approach.

  • Sample complex data type column with two sub fields are taken as in below image. enter image description here img:1 source data preview

  • In Derived column transformation, For the column fraudData, expression is given as

@(fraudData_1_chn=fraudData.{fraudData-1-chn},
fraudData_2_chn=fraudData.{fraudData-2-chn})

enter image description here img:2 Derived column settings

  • This expression renames the subfields and nests them under the parent column fraudData.

enter image description here img:3 Transformed data- Fields are renamed.

Update: To rename sub columns dynamically

  • You can use below expression to rename all the fields under the root column fraudData.
@(each(fraudData, match(true()), replace($$,'-','_') =  $$))

enter image description here

This will replace fields which has - with _.

gif112

You can also use pattern match in the expression.

@(each(fraudData, patternMatch(`fraudData-.+` ), replace($$,'-','_') = $$))

This expression will take fields with pattern fraudData-.+ and replace - with _ in those fields only.

Reference:

  1. Microsoft document on script for hierarchical definition in data flow.
  2. Microsoft document on building schemas using derived column transformation .

Upvotes: 0

Related Questions