Reputation: 13088
Given a table like this (named my_table
)
| A | B | C |
+----+---+---+
| -1 | a | x |
| 0 | b | x |
| 1 | c | x |
| 2 | d | y |
| 3 | e | z |
How do I do the equivalent of the following in Google BigQuery?
SELECT MIN(A) as A_min, B, C
FROM my_table
GROUP BY C
WHERE A > 0
The expected result is
| A_min | B | C |
+--------+---+---+
| 1 | c | x |
| 2 | d | y |
| 3 | e | z |
The query above works fine in SQL (and just about any other SQL-ish engine I know of), but in BigQuery, it returns the error:
SELECT list expression references B which is neither grouped nor aggregated
I get why that error happens. My question is: how do I find the corresponding row in column B
for MIN(A), WHERE A > 0
?
Some of the proposed answers worked great for the trivial sample table that only includes one unique value for C
. In a more realistic case, there will be many values for C
, hence the need to aggregate on it.
Upvotes: 0
Views: 354
Reputation: 173046
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.my_table` AS (
SELECT -1 A, 'a' B, 'x' C UNION ALL
SELECT 0, 'b', 'x' UNION ALL
SELECT 1, 'c', 'x' UNION ALL
SELECT 2, 'd', 'x' UNION ALL
SELECT 3, 'e', 'x'
)
SELECT ARRAY_AGG(STRUCT(A AS A_min, B) ORDER BY A LIMIT 1)[OFFSET(0)].*, C
FROM `project.dataset.my_table`
WHERE A > 0
GROUP BY C
with result
Row A_min B C
1 1 c x
A little less verbose option would be
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY A LIMIT 1)[OFFSET(0)]
FROM `project.dataset.my_table` t
WHERE A > 0
GROUP BY C
with result
Row A B C
1 1 c x
Upvotes: 2
Reputation: 521997
You could use ROW_NUMBER
here:
SELECT A, B, C
FROM
(
SELECT *, ROW_NUMBER() OVER (ORDER BY A) rn
FROM my_table
WHERE A > 0
) t
WHERE rn = 1;
This approach has the advantage of flexibility, because if you instead want all smallest A
records with ties, all you need to do is replace ROW_NUMBER
with RANK
.
Upvotes: 2
Reputation: 13006
Include the columns on your group by clause.
SELECT MIN(A) as A_min, B, C
FROM my_table
WHERE A > 0
GROUP BY C, B
or
SELECT t2.*, t1.C
FROM my_table t1
INNER JOIN
(SELECT MIN(A) as A_min, B
FROM my_table
WHERE A > 0
GROUP BY B) t2 on t2.A_min = t1.A and t2.B = t1.B
Upvotes: 1
Reputation: 243
Try this:
SELECT *
FROM my_table
WHERE A = (select min(A) from my_table where A > 0)
you can also try:
Select B
from my_table
where A > 0
order by A ASC
LIMIT 1;
depending on the need of this query.
Upvotes: 0