Uri Kluk

Reputation: 185

How to run U-SQL for all the files in a folder using parameters from ADF?

I can't work out how to pass the "in" parameter to U-SQL so that the script reads all the files in the folder.

In my ADF pipeline, I have the following parameter settings:

"parameters": {
    "in": "$$Text.Format('stag/input/{0:yyyy}/{0:MM}/{0:dd}/*.csv', SliceStart)",
    "out": "$$Text.Format('stag/output/{0:yyyy}/{0:MM}/{0:dd}/summary.csv', SliceStart)"
}

And the U-SQL script tries to extract from:

@couponlog =
    EXTRACT Id int,
            [Other columns here]
    FROM @in
    USING Extractors.Csv(skipFirstNRows:1);

But I get a file-not-found error during execution. The files exist in the data lake, but I don't know the correct syntax to pass the path as a parameter.

Upvotes: 1

Views: 1299

Answers (3)

Atsu

Reputation: 13

I use this input parameter in ADF to read all the files from a folder, with a virtual column (file) that captures the name of each file:

"parameters": {
    "in": "$$Text.Format('storage/folder/{0:yyyy}-{0:MM}/{1}.csv', SliceStart, '{file:*}')",
    "out": "$$Text.Format('otherFolder/{0:yyyy}-{0:MM}/result.txt', SliceStart)"
}

The related U-SQL:

@sales =
    EXTRACT column1 string,
            column2 decimal,
            file string
    FROM @in
    USING Extractors.Csv(silent : true);
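As a rough illustration of what the script ends up working with: assuming ADF hands @in and @out to the script as strings and SliceStart falls in January 2017 (both assumptions, as are the default values and the SELECT below), the pattern would expand to something like the following, and the file column would carry the part of each file name matched by the wildcard.

```usql
// Hypothetical defaults for a January 2017 slice; ADF would be expected
// to supply the real values, which override DECLARE EXTERNAL defaults.
DECLARE EXTERNAL @in string = "storage/folder/2017-01/{file:*}.csv";
DECLARE EXTERNAL @out string = "otherFolder/2017-01/result.txt";

@sales =
    EXTRACT column1 string,
            column2 decimal,
            file string          // virtual column filled from the path pattern
    FROM @in
    USING Extractors.Csv(silent : true);

// Keep the source file name next to each row.
@result =
    SELECT file,
           column1,
           column2
    FROM @sales;

OUTPUT @result
TO @out
USING Outputters.Csv();
```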

Upvotes: 1

user4999138

Reputation:

I'm using dates passed in by ADF with no trouble. I pass in just the date portion and then build the full path within U-SQL:

"parameters": {
  "in": "$$Text.Format('{0:yyyy}/{0:MM}/{0:dd}/', SliceStart)"
}

Then in U-SQL:

DECLARE @inputPath = "path/to/file/" + @in + "{*}.csv";
DECLARE @outputPath = "path/to/file/" + @in + "output.csv";

Those variables then get used in the script as needed.
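A minimal sketch of how those variables might be used, assuming ADF supplies @in as a string such as "2017/01/15/". The default value, the column names (Id, Name) and the header options are hypothetical, not taken from my real script.

```usql
// Hypothetical default so the script also runs on its own;
// a value supplied by ADF would take precedence over it.
DECLARE EXTERNAL @in string = "2017/01/15/";

DECLARE @inputPath string = "path/to/file/" + @in + "{*}.csv";
DECLARE @outputPath string = "path/to/file/" + @in + "output.csv";

// {*} in the path matches every .csv file in that day's folder.
@rows =
    EXTRACT Id int,
            Name string
    FROM @inputPath
    USING Extractors.Csv(skipFirstNRows : 1);

OUTPUT @rows
TO @outputPath
USING Outputters.Csv(outputHeader : true);
```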

Upvotes: 1

Uri Kluk

Reputation: 185

I am sure there are many ways to solve the issue, but what I found is that instead of passing a parameter from the ADF pipeline, it is easier to use virtual columns. In my case the virtual column is v_date:

@couponlog =
    EXTRACT Id int,
            [Other columns here],
            v_date DateTime
    FROM "stag/input/{v_date:yyyy}/{v_date:MM}/{v_date:dd}/{*}.csv"
    USING Extractors.Csv(skipFirstNRows:1);

With this, the U-SQL script found all the files.
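Since the pattern matches every date folder, a filter on v_date can narrow the rows back down to one day. A minimal sketch, assuming a hard-coded hypothetical date and output path (neither is from my actual job) and only the Id column:

```usql
// Hypothetical day to process.
DECLARE @day DateTime = new DateTime(2017, 1, 15);

@couponlog =
    EXTRACT Id int,
            v_date DateTime      // virtual column built from the folder names
    FROM "stag/input/{v_date:yyyy}/{v_date:MM}/{v_date:dd}/{*}.csv"
    USING Extractors.Csv(skipFirstNRows : 1);

// Keep only rows whose folder date equals the chosen day.
@oneDay =
    SELECT Id
    FROM @couponlog
    WHERE v_date == @day;

OUTPUT @oneDay
TO "stag/output/one_day.csv"     // hypothetical output path
USING Outputters.Csv();
```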

Upvotes: 2
