yhxhappy

Reputation: 11

How to use U-SQL scripts to add elements of two csv files?

I'm trying to use U-SQL scripts in Azure Data Lake Analytics (ADLA) to process two CSV files uploaded to Azure Data Lake Store (ADLS). Each CSV file has one row and three columns. I'm not sure how to write a U-SQL script that adds the three elements of each file and writes the results to a new CSV file. Could anyone help me with this?

Upvotes: 1

Views: 389

Answers (2)

wBob

Reputation: 14389

If your files are in the same folder, you don't need to UNION anything. Simply use file sets and virtual columns to refer to them. Here is a simple example:

@input =
    EXTRACT colA int,
            colB string,
            colC DateTime?,
            filename string
    FROM "/input/{filename}.log"
    USING Extractors.Csv();


// Do some processing if you need
@output =
    SELECT DISTINCT *
    FROM @input;


// Output results
OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv();

In this example, I have two files with the same structure and a .log extension in my input directory. When I run the script, the two files are effectively UNIONed together into one result set, and the virtual column filename tells you which file each row came from.
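
To actually add the three values in each file, the processing step could compute a per-row sum. This is only a minimal sketch: it assumes the three columns are numeric, and the column names (v1, v2, v3), the .csv pattern, and the output path are hypothetical placeholders.

```usql
// Assumption: every CSV in /input holds one row of three numeric values.
// v1, v2, v3 and both paths are illustrative names, not from the question.
@input =
    EXTRACT v1 double,
            v2 double,
            v3 double,
            filename string
    FROM "/input/{filename}.csv"
    USING Extractors.Csv();

// One output row per input row: the source file name and the sum of its three values.
@sums =
    SELECT filename,
           v1 + v2 + v3 AS total
    FROM @input;

OUTPUT @sums
TO "/output/sums.csv"
USING Outputters.Csv();
```

Because the file name travels with each row as a virtual column, the output keeps one sum per file without any explicit UNION.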

Upvotes: 2

Andrey Vykhodtsev

Reputation: 1061

If I understand your question correctly, you need to output 3 rows from your CSV files, where each file has 1 row and 3 columns. One way to do that is to use the UNION operation in U-SQL:

    @result =
        SELECT * FROM @f1
        UNION ALL BY NAME ON (*)
        SELECT * FROM @f2
        UNION ALL BY NAME ON (*)
        SELECT * FROM @f3;

    OUTPUT @result
    TO "pathtoyourfile.csv"
    USING Outputters.Csv();
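
The rowsets @f1, @f2 and @f3 are not defined in the snippet above; presumably each comes from its own EXTRACT. A rough sketch of what one of them might look like, assuming three integer columns (the column names a, b, c and the file path are hypothetical):

```usql
// Assumption: the file has one row with three integer columns.
// The path and the column names a, b, c are placeholders.
@f1 =
    EXTRACT a int,
            b int,
            c int
    FROM "/input/file1.csv"
    USING Extractors.Csv();
```

@f2 and @f3 would be extracted the same way from their own paths. The column names need to match across the rowsets for BY NAME to line them up.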

Upvotes: 1
