Reputation: 784
I have the below query :
SELECT DISTINCT Summed, ROW_NUMBER () OVER (order by Summed desc) as Rank from table1
I have to write it in Apache Beam(beamSql). Below is my code :
PCollection<BeamRecord> rec_2_part2 = rec_2.apply(BeamSql.query("SELECT DISTINCT Summed, ROW_NUMBER(Summed) OVER (ORDER BY Summed) Rank1 from PCOLLECTION "));
But I'm getting the below error :
Caused by: java.lang.UnsupportedOperationException: Operator: ROW_NUMBER is not supported yet!
Any idea how to implement ROW_NUMBER() in beamSql ?
Upvotes: 1
Views: 568
Reputation: 784
The solution which worked for the above query:
PCollection<BeamRecord> rec_2 = rec_1.apply(BeamSql.query("SELECT max(Summed) as maxed, max(Summed)-10 as least, 'a' as Dummy from PCOLLECTION"));
Upvotes: 0
Reputation: 521674
Here is one way you can approximate your current query without using ROW_NUMBER
:
SELECT
t1.Summed,
(SELECT COUNT(*) FROM (SELECT DISTINCT Summed FROM table1) t2
WHERE t2.Summed >= t1.Summed) AS Rank
FROM
(
SELECT DISTINCT Summed
FROM table1
) t1
The basic idea is to first subquery to get a table with only distinct Summed
values. Then, use a correlated subquery to simulate the row number. This isn't a very efficient method, but if ROW_NUMBER
is not available, then you're stuck with some alternative.
Upvotes: 1