Tony Nguyễn
Tony Nguyễn

Reputation: 3

Choose value which pointed to highest value

I have a table like this:

enter image description here

I want to transform the table into this:

enter image description here

To get the table 2, I group by value in column 1, then pick the values in column 2 where duration is highest value. Can anyone help me with the syntax?

Upvotes: 0

Views: 31

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Below is for Bigquery Standard SQL

#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY duration DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY column1   

You can test, play with above using sample data from your question as in example below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'A' column1, 100 column2, 3 duration UNION ALL
  SELECT 'A', 101, 7 UNION ALL
  SELECT 'A', 102, 5 UNION ALL
  SELECT 'B', 108, 5 UNION ALL
  SELECT 'B', 109, 9 
)
SELECT AS VALUE ARRAY_AGG(t ORDER BY duration DESC LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY column1  

with result

Row column1 column2 duration     
1   A       101     7    
2   B       109     9    

Upvotes: 0

Ben P
Ben P

Reputation: 3369

One solution is to add a ROW_NUMBER() to your data which is ordered on the "Duration" field, then filter on that row number so you are only selecting the highest values from the partition.

Here is an example with a dummy data based on your screenshot:

WITH table AS (
  SELECT 1 column1, 100 column2, 3 Duration UNION ALL
  SELECT 1, 101, 7 UNION ALL
  SELECT 1, 102, 5 UNION ALL
  SELECT 2, 108, 5 UNION ALL
  SELECT 2, 109, 9 
)

    SELECT * EXCEPT (rn) FROM(
    SELECT column1, 
    column2, 
    Duration, 
    ROW_NUMBER() OVER(PARTITION BY column1 ORDER BY Duration DESC) AS rn 
    FROM table
    )
    WHERE rn = 1

Which returns the result:

enter image description here

Upvotes: 1

Related Questions