Reputation: 3
This is only my second time using Stack Overflow so I'm open to any constructive criticism on how to better format my questions.
I have a list of orders and I'm wanting to create a table of useful customer information.
I've created a new table that identifies unique customers (using just a select distinct of customer IDs) but I'm not sure of the proper functions to accurately group them and produce a boolean value based on their accompanying fields.
I need the new fields to display a boolean for whether or not any of the customer's orders have been for a particular product SKU.
Imagine this is the source table
NAME | PRODUCT
------------
Andy | 1
Bill | 2
Cole | 2
Andy | 2
Bill | 1
Cole | 2
Dave | 3
I'm wanting the output to only have unique values for each name accompanied by a boolean displaying whether or not any record of that given name has received that product.
NAME | HAS1 | HAS2 | HAS3
--------------------------
Andy | true | true | false
Bill | true | true | false
Cole | false | true | false
Dave | false | false | true
Upvotes: 0
Views: 970
Reputation: 173046
Below is for BigQuery Standard SQL
If you know in advance product names (like '1', '2', '3' in your example) and there are just few - you can use below simple version
#standardSQL
SELECT name,
MAX(product = '1') AS has1,
MAX(product = '2') AS has2,
MAX(product = '3') AS has3
FROM `project.dataset.table`
GROUP BY name
If to apply to sample data from your question (I assume your product are of string data type here)
WITH `project.dataset.table` AS (
SELECT 'Andy' name, '1' product UNION ALL
SELECT 'Bill', '2' UNION ALL
SELECT 'Cole', '2' UNION ALL
SELECT 'Andy', '2' UNION ALL
SELECT 'Bill', '1' UNION ALL
SELECT 'Cole', '2' UNION ALL
SELECT 'Dave', '3'
)
result is
Row name has1 has2 has3
1 Andy true true false
2 Bill true true false
3 Cole false true false
4 Dave false false true
In case if product names are not known in advance and/or number of products more than just few - below version can be handy
EXECUTE IMMEDIATE '''
SELECT name,''' || (
SELECT STRING_AGG(DISTINCT "MAX(product = '" || product || "') AS has" || product)
FROM `project.dataset.table`
) || '''
FROM `project.dataset.table`
GROUP BY name
'''
with exact same output
As you can see here - whole query is assembled dynamically so you don't need to worry about number of products and their names
Below version is identical to above, but easier to manage/read
EXECUTE IMMEDIATE FORMAT('''
SELECT name, %s
FROM `project.dataset.table`
GROUP BY name
''', (
SELECT STRING_AGG(DISTINCT "MAX(product = '" || product || "') AS has" || product)
FROM `project.dataset.table`
))
Upvotes: 1
Reputation: 1270401
You can use aggregation and simple logic:
select name,
countif(product = 1) > 0 as has_1,
countif(product = 2) > 0 as has_2,
countif(product = 3) > 0 as has_3
from t
group by name;
You can also use logical_or()
:
select name,
logic_or(product = 1) as has_1,
logic_or(product = 2) as has_2,
logic_or(product = 3) as has_3
from t
group by name;
Upvotes: 0