Reputation: 85
I have the following table. basically simplified version of my table. I need to aggregate few columns, I will explain what I am trying to do and also what I have written till now.
tableName
food.id STRING NULLABLE
food.basket.id STRING NULLABLE
food.foodType STRING NULLABLE
food.price INTEGER NULLABLE
food.printed BOOLEAN NULLABLE
food.variations RECORD REPEATED
food.variations.id INTEGER REPEATED
food.variations.amount INTEGER NULLABLE
Sample data
id basket.id. foodType. price. printed. variations.id variations.amount
1. abbcd. JUNK. 100. TRUE. NULL. NULL
2. cdefg. PIZZA. 200. TRUE. 1234. 10
2345. 20
5678. 20
3. abbcd. JUNK. 200. FALSE. 1234. 10
4. uiwka. TOAST. 500. FALSE. NULL. NULL
variations can be like pizza toppings, each variation has an amount, say veggie toppings cost 10 cent and meat toppings cost 20 cents for simplicity
so now I am trying to aggregate some data for this table
I am trying to get
This is the query I have:
select SUM(CASE When item.printed = TRUE Then 1 Else 0 End ) as printed,
SUM(CASE When item.printed = FALSE Then 1 Else 0 End) as nonPrinted,
SUM(item.price) as price,
(select COUNT(DISTINCT(item.basket.id)) from tableName where itemType = "JUNK") AS baskets,
(select SUM(CASE when m.amount is NULL then 0 Else m.amount END) as variations_total from tableName, UNNEST(item.variations) as m) as variations
from tableName;
printed. unprinted. price. baskets. variations.
2. 2. 1000. 1. 60
Now I get the result that I expect. I am trying to understand if we can do this without using subqueries and use only joins?
Upvotes: 1
Views: 72
Reputation: 172944
Below is for BigQuery Standard SQL and assumes that your query is really working (saying this because your data example does not exactly fit into query you provided)
So, below two subqueries
(select COUNT(DISTINCT(item.basket.id)) from tableName where itemType = "JUNK") AS baskets,
(select SUM(CASE when m.amount is NULL then 0 Else m.amount END) as variations_total from tableName, UNNEST(item.variations) as m) as variations
can be replace with
COUNT(DISTINCT IF(itemType = "JUNK", item.basket.id, NULL)) AS baskets,
SUM((SELECT SUM(amount) FROM item.variations)) AS variations
Believe me or not - but result will be the same
Row printed nonPrinted price baskets variations
1 2 2 1000 1 60
So, as you can see yo don't need subqueries and you don't need joins here either
Note: in the second row - (SELECT SUM(amount) FROM item.variations)
is not really the same type of subquery as in your original query. Rather here for each row you query its array to find sum of amount in that row which is then being aggregated to total sum ...
Hope you get it
Upvotes: 2