Reputation: 323
I have a simple query to obtain the currency rate in use at the time a transaction was created:
SELECT t.orderid, t.date,
(SELECT rate FROM sources.currency_rates r WHERE currencyid=1 AND
r.date>=t.date ORDER BY date LIMIT 1) rate
FROM sources.transactions t
This triggers an error:
Error: Correlated subqueries that reference other tables are not
supported unless they can be de-correlated, such as by transforming
them into an efficient JOIN.'
I've tried with several types of joins and named subqueries, but none seem to work. What is the best way to accomplish this? Seems like a very common scenario that should be quite straightforward to implement in BQ's Standard Sql.
Upvotes: 28
Views: 37952
Reputation: 173190
Below is for BigQuery Standard SQL
#standardSQL
SELECT
t.orderid AS orderid,
t.date AS date,
ARRAY_AGG(r.rate ORDER BY r.date LIMIT 1)[SAFE_OFFSET(0)] AS rate
FROM `sources.transactions` AS t
JOIN `sources.currency_rates` AS r
ON currencyid = 1
AND r.date >= t.date
GROUP BY orderid, date
Upvotes: 17
Reputation: 59325
I've noticed similar behavior with other correlated subqueries. They are useful, but can't always be automatically modeled to JOINs by BigQuery.
Similar case which works:
#standardSQL
SELECT name, (
SELECT AVG(temp)
FROM `bigquery-public-data.noaa_gsod.gsod2017` b
WHERE a.usaf=b.stn
) temp
FROM `bigquery-public-data.noaa_gsod.stations` a
LIMIT 10
Doesn't work:
#standardSQL
SELECT name, (
SELECT temp
FROM `bigquery-public-data.noaa_gsod.gsod2017` b
WHERE a.usaf=b.stn
ORDER BY da
LIMIT 1
) temp
FROM `bigquery-public-data.noaa_gsod.stations` a
LIMIT 10
Fix:
#standardSQL
SELECT name, ARRAY_AGG(temp ORDER BY da LIMIT 1) temp
FROM `bigquery-public-data.noaa_gsod.stations` a
JOIN `bigquery-public-data.noaa_gsod.gsod2017` b
ON a.usaf=b.stn
GROUP BY 1
LIMIT 10
(give me a public dataset, and I'll write a query that works with your data)
Upvotes: 10