screechOwl
screechOwl

Reputation: 28159

Azure / U-SQL - ORDER BY in ROW_NUMBER()

I'm building a U-SQL script that has several intermediate tables. I'd like to assign ROW_NUMBER() based on the ordering of one of the columns but can't figure out how to do so.

@t = SELECT *
     FROM(
        VALUES
        ( 1, "2010-01-01","TagName1", 30 ),
        ( 2, "2010-01-01","TagName2", 5 ),
        ( 2, "2010-01-02","TagName2", 7 ),
        ( 2, "2010-01-02","TagName3", 6 ),
        ( 3, "2010-01-03","TagName2", 15 ),
        ( 1, "2010-01-01","TagName2", 2 ),
        ( 1, "2010-01-01","TagName3", 1),
        ( 3, "2010-01-04","TagName1", 2 ),
        ( 3, "2010-01-04","TagName2", 4 )
     ) AS T(DeviceID, Date, TagName, dv);

@t2 = SELECT DISTINCT(dv) as dv
             , ROW_NUMBER() OVER() as LevelNumber
      FROM @t
      ORDER BY dv;

If I run the above code it tells me to move the ORDER BY to the OUTPUT statement but then this doesn't work when the table contents changes in subsequent manipulations.

I tried putting the column name in the OVER() part but this doesn't work either.

I'd like something like this:

  dv          LevelNumber
  1                1
  2                2      
  4                3
  5                4

Any suggestions?

Upvotes: 1

Views: 474

Answers (1)

wBob
wBob

Reputation: 14389

You can include ORDER BY and/or PARTITION BY clauses in the OVER clause. You can move the ORDER BY clause to the OUTPUT statement, eg

@t = SELECT *
     FROM(
        VALUES
        ( 1, "2010-01-01","TagName1", 30 ),
        ( 2, "2010-01-01","TagName2", 5 ),
        ( 2, "2010-01-02","TagName2", 7 ),
        ( 2, "2010-01-02","TagName3", 6 ),
        ( 3, "2010-01-03","TagName2", 15 ),
        ( 1, "2010-01-01","TagName2", 2 ),
        ( 1, "2010-01-01","TagName3", 1),
        ( 3, "2010-01-04","TagName1", 2 ),
        ( 3, "2010-01-04","TagName2", 4 )
     ) AS T(DeviceID, Date, TagName, dv);


//@t2 = SELECT DISTINCT(dv) as dv
//             , ROW_NUMBER() OVER() as LevelNumber
//      FROM @t
//      ORDER BY dv;


@output =
    SELECT DeviceID,
           Date,
           TagName,
           dv,
           ROW_NUMBER() OVER( ORDER BY dv ) AS LevelNumber
    FROM @t;


OUTPUT @output TO "/output/output.csv"
ORDER BY dv
USING Outputters.Csv();

The closest I could get to your result was with this query, but the logic you need is not 100% clear to me:

@output =
    SELECT dv,
           ROW_NUMBER() OVER(ORDER BY dv) AS LevelNumber
    FROM
    (
        SELECT DISTINCT dv
        FROM @t
    ) AS x;

Upvotes: 2

Related Questions