murtaza.webdev
murtaza.webdev

Reputation: 3641

bigquery row_number group by some rows based on specific field

I have data like this i.e.

cityName      stateId
cityText1        52
cityText2        52
cityText3        52

cityExp1         72
cityExp2         72
cityExp3         72

city1            21
city2            21

i am using SUB QUERIES FOR GETTING DATA. now with BIGQUERY I want data like this:

cityName      rowNumber
cityText1        1
cityText2        1
cityText3        1

cityExp1         2
cityExp2         2
cityExp3         2

city1            3
city2            3

I have tried to use row_number() but it gives unique number to every row. so is this possible how i want.

Upvotes: 2

Views: 1327

Answers (1)

Pentium10
Pentium10

Reputation: 207952

You need first to join them with a scalar value to be one partition, then you can apply the ROW_NUMBER on that partition.

Update: scroll to the bottom of the answer to see a suggestion without using scalar.

SELECT stateId,
       row_number() over (partition BY scalar) AS INDEX
FROM
  (SELECT stateId,
          1 AS scalar
   FROM
     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)
   GROUP BY stateId) d

this returns:

+-----+---------+-------+---+
| Row | stateId | index |   |
+-----+---------+-------+---+
|   1 |      52 |     1 |   |
|   2 |      72 |     2 |   |
|   3 |      21 |     3 |   |
+-----+---------+-------+---+

Then you can join the table again and prepare the final output. For our static tables it's a pretty long query:

SELECT t.cityName,
       t.stateId,
       d.index
FROM
  (SELECT *
   FROM
     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)) t
JOIN
  (SELECT stateId,
          row_number() over (partition BY scalar) AS INDEX
   FROM
     (SELECT stateId,
             1 AS scalar
      FROM
        (SELECT 'cityText1' AS cityName,
                52 AS stateId),
        (SELECT 'cityText2' AS cityName,
                52 AS stateId),
        (SELECT 'cityText3' AS cityName,
                52 AS stateId),
        (SELECT 'cityExp1' AS cityName,
                72 AS stateId),
        (SELECT 'cityExp2' AS cityName,
                72 AS stateId),
        (SELECT 'cityExp3' AS cityName,
                72 AS stateId),
        (SELECT 'city1' AS cityName,
                21 AS stateId),
        (SELECT 'city2' AS cityName,
                21 AS stateId)
      GROUP BY stateId)) d ON d.stateId=t.stateId

this returns the final output:

+-----+------------+-----------+---------+---+
| Row | t_cityName | t_stateId | d_index |   |
+-----+------------+-----------+---------+---+
|   1 | cityText1  |        52 |       1 |   |
|   2 | cityText2  |        52 |       1 |   |
|   3 | cityText3  |        52 |       1 |   |
|   4 | cityExp1   |        72 |       2 |   |
|   5 | cityExp2   |        72 |       2 |   |
|   6 | cityExp3   |        72 |       2 |   |
|   7 | city1      |        21 |       3 |   |
|   8 | city2      |        21 |       3 |   |
+-----+------------+-----------+---------+---+

Update: After the update without scalar the query becomes:

SELECT stateId,
       row_number() over () AS INDEX
FROM

     (SELECT 'cityText1' AS cityName,
             52 AS stateId),
     (SELECT 'cityText2' AS cityName,
             52 AS stateId),
     (SELECT 'cityText3' AS cityName,
             52 AS stateId),
     (SELECT 'cityExp1' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp2' AS cityName,
             72 AS stateId),
     (SELECT 'cityExp3' AS cityName,
             72 AS stateId),
     (SELECT 'city1' AS cityName,
             21 AS stateId),
     (SELECT 'city2' AS cityName,
             21 AS stateId)
   group by stateId

Upvotes: 4

Related Questions