SQL: Group rows by field a or field b

Question

I am not sure whether this is possible in SQL. My table has two fields, "a" and "b" (key1 and key2 in the example below). I want to group rows that have the same "a" value or the same "b" value and then, for each group, extract the value of another field from the row with the latest timestamp. Is this possible? I am using Redshift.

Here is an example table:

|| key1 || key2 || date || value ||
| 1 | 'a' | '2019-01-01' | 'john' |
| 1 | 'b' | '2019-01-03' | 'ben' |
| 2 | 'c' | '2019-02-01' | 'kim' |
| 3 | 'c' | '2019-02-02' | 'dan' |
| 4 | 'd' | '2019-02-03' | 'jan' |

Eventually what I want to have is:

| 'ben' |
| 'dan' | 
| 'jan' |

D-Shih · Accepted Answer

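You can build a result set with the ROW_NUMBER and SUM window functions, partitioned by each of the two keys. SUM counts how many rows share a key, and ROW_NUMBER, ordered by date descending, ranks the rows within a key from latest to oldest.

Then keep the row with rn = 1 (the latest row) from each group that has more than one row, plus every row that shares neither key with another row.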

Query 1:

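SELECT value
FROM (
  SELECT value,
         -- how many rows share this key1 / key2 (no ORDER BY here:
         -- Redshift requires a frame clause when an aggregate window has one)
         SUM(1) OVER (PARTITION BY key1) AS cnt1,
         SUM(1) OVER (PARTITION BY key2) AS cnt2,
         -- 1 marks the latest row within each key1 / key2 group
         ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY date DESC) AS rn1,
         ROW_NUMBER() OVER (PARTITION BY key2 ORDER BY date DESC) AS rn2
  FROM T
) t1
WHERE (rn1 = 1 AND cnt1 > 1)   -- latest row among rows sharing key1
   OR (rn2 = 1 AND cnt2 > 1)   -- latest row among rows sharing key2
   OR (cnt1 = 1 AND cnt2 = 1)  -- row that shares neither key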

Results:

| value |
|-------|
|   ben |
|   dan |
|   jan |
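
To see why each row is kept or dropped, it can help to load the sample data and look at the helper columns before the WHERE clause is applied. The following is only a sketch: the table name T comes from the query above, but the column types are assumptions, since the question does not state them.

```sql
-- Sample data from the question. Column types are assumed, not given.
CREATE TABLE T (
  key1  INT,
  key2  VARCHAR(10),
  date  DATE,
  value VARCHAR(20)
);

INSERT INTO T (key1, key2, date, value) VALUES
  (1, 'a', '2019-01-01', 'john'),
  (1, 'b', '2019-01-03', 'ben'),
  (2, 'c', '2019-02-01', 'kim'),
  (3, 'c', '2019-02-02', 'dan'),
  (4, 'd', '2019-02-03', 'jan');

-- Same window expressions as the inner query, shown unfiltered.
SELECT value,
       SUM(1) OVER (PARTITION BY key1) AS cnt1,
       SUM(1) OVER (PARTITION BY key2) AS cnt2,
       ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY date DESC) AS rn1,
       ROW_NUMBER() OVER (PARTITION BY key2 ORDER BY date DESC) AS rn2
FROM T
ORDER BY date;

-- value | cnt1 | cnt2 | rn1 | rn2
-- john  |    2 |    1 |   2 |   1   -- dropped: older row of key1 = 1
-- ben   |    2 |    1 |   1 |   1   -- kept:    latest row of key1 = 1
-- kim   |    1 |    2 |   1 |   2   -- dropped: older row of key2 = 'c'
-- dan   |    1 |    2 |   1 |   1   -- kept:    latest row of key2 = 'c'
-- jan   |    1 |    1 |   1 |   1   -- kept:    shares neither key
```

This works for the sample because each row shares at most one key with other rows. If rows can be chained (one row shares key1 with a second row, which shares key2 with a third), the filter may return more than one row for what is logically a single group, so that case would need a different approach.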
