Grevioos
Grevioos

Reputation: 405

Azure Data Factory Data Flow: how to filter input column with multiple values

What I am trying to do is basically something that emulates SQL WHERE IN CLAUSE in a data flow. I want to pass comma separated string of values into my data flow, i.e ptf_proc_link_id = "A, B, C".

enter image description here

Inside of the data flow, I want to use those values, to filter one of my input columns:

enter image description here

It would be equal to writing in SQL SELECT * FROM my_delta_table where ptf_proc_link_id in (A, B, C).

However, data flow documentation only hints that I can filter single values i.e ptf_proc_link_id = A. I have tried to use intersect, but then both parameters needs to be arrays, and in my case one of them is a column. I could do something like ptf_proc_link_id = A OR ptf_proc_link_id = B OR ptf_proc_link_id = C, but I never know how many input values I would have, so I cannot hardcode it.

It seems like a very simple use case for a data flow, so I would be really thankful if someone could explain me what the correct approach would be.

Thank you!

Upvotes: 0

Views: 8231

Answers (2)

You always could declare your variables in your query and equal these to a ADF variable or parameter.For example, in the lookup activity or Query in CopyActivity you could set these: i.e:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SET NOCOUNT ON
--SET FMTONLY OFF

DECLARE 

    @frdate DATETIME = @{variables('FromDate')}, -- created in ADF
    @todate DATETIME = @{variables('ToDate')}; -- created in pipeline

select 1
FROM table c
where
c.var_1 in (@{pipeline().parameters.parameter1}) -- Create the parameter in ADF
AND c.DateTime between @frdate and @todate

Upvotes: 0

Aswin
Aswin

Reputation: 7126

You can use in() function in ADF data flow to filter multiple values.

Syntax: in(array of items,item to find)

  • I tried to repro this with sample input data.
key class name mark DOB
1 1 Arjuna 50 1/2/2015
2 1 Basanta 47 5/1/2015
3 1 Subala 54 5/6/2015
4 2 Gandharva 60 1/2/2014
5 2 Ujjvala 55 9/2/2014
6 2 Sanadhana 64 1/12/2014
7 3 Sridama 75 1/2/2013
8 3 Sudama 80 13/12/2013
9 3 Vasu 81 1/12/2013

enter image description here img:1 Source data preview.

  • I tried to filter the records with key=2,4,6

  • Filter condition can be given as in(['2','4','6'], key) or in(array('2','4','6'), key) enter image description here img:2 Filter condition settings

  • Output data of the filter Transformation:

enter image description here img:3 Filter Transformation data preview

  • You can also create a dataflow parameter of array type and give that parameter in the filter condition of filter transformation. enter image description here img: 4 dataflow parameters

  • Filter condition while using parameter will be in($parameter1,key)

Reference: Microsoft document on data flow expression - in

Upvotes: 2

Related Questions