Reputation: 1137
I have a table that looks like the following:
col1 | col2 | col3 | col4
A | 1 | 2 | 4
A | 2 | 5 | 3
A | 5 | 1 | 6
B | 3 | 1 | 2
B | 4 | 4 | 4
I have another table where the records are unique and looks like the following:
col1 | col2
A | 2
B | 1
I want to query Table 1
in such a way that I filter out only n
number of records for each category in Table 1
based on the value the categories have in Table 2
.
Based on Table 2
I need to extract 2 records for A
and 1 record for B
. I need the resulting queried table to look like the following:
col1 | col2 | col3 | col4
A | 2 | 5 | 3
A | 1 | 2 | 4
B | 3 | 1 | 2
The choice of the records are made based on col4
sorted in ascending order. I am currently tring to do this on BigQuery.
Upvotes: 3
Views: 66
Reputation: 173003
Below is for BigQuery Standard SQL
#standardSQL
SELECT t.*
FROM (
SELECT ARRAY_AGG(t1 ORDER BY t1.col4) arr, MIN(t2.col2) cnt
FROM table1 t1 JOIN table2 t2 ON t1.col1 = t2.col1
GROUP BY t1.col1
), UNNEST(arr) t WITH OFFSET num
WHERE num < cnt
you can test / play with it using dummy data from your question as below
#standardSQL
WITH `table1` AS (
SELECT 'A' col1, 1 col2, 2 col3, 4 col4 UNION ALL
SELECT 'A', 2, 5, 3 UNION ALL
SELECT 'A', 5, 1, 6 UNION ALL
SELECT 'B', 3, 1, 2 UNION ALL
SELECT 'B', 4, 4, 4
), `table2` AS (
SELECT 'A' col1, 2 col2 UNION ALL
SELECT 'B', 1
)
SELECT t.*
FROM (
SELECT ARRAY_AGG(t1 ORDER BY t1.col4) arr, MIN(t2.col2) cnt
FROM table1 t1 JOIN table2 t2 ON t1.col1 = t2.col1
GROUP BY t1.col1
), UNNEST(arr) t WITH OFFSET num
WHERE num < cnt
with output as
Row col1 col2 col3 col4
1 A 2 5 3
2 A 1 2 4
3 B 3 1 2
Upvotes: 1
Reputation: 1269943
You can use row_number()
and join
:
select t1.col1, t1.col2, t1.col3, t1.col4
from (select t1.*, row_number() over (partition by col1 order by col4) as seqnum
from table1 t1
) t1 join
table2 t2
on t2.col1 = t1.col1 and t1.seqnum <= t2.col2
order by t1.col1, t1.col4;
Upvotes: 2