Adiansyah
Adiansyah

Reputation: 353

Bigquery SQL - WHERE IN Col Value

so i have reference table (table A) like this

| cust_id | prod |
|   1     | A, B |
|   2     | C, D, E|

This reference table will be joined by transaction history like table (table B)

| trx_id | cust_id | prod | amount
|   1    |    1    | A    |  10
|   2    |    1    | B    |  5
|   3    |    1    | C    |  1
|   4    |    1    | D    |  6

i want to get sumup value of table b amount, but the list of products is only obtained from table A. i tried something like this but doesn't work

SELECT A.cust_id
, SUM(B.amount) AS amount
FROM A 
INNER JOIN B ON A.cust_id = B.cust_id
AND B.prod IN(A.prod)
GROUP BY 1

Upvotes: 0

Views: 69

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL

#standardSQL
select cust_id, 
  sum(if(b.prod in unnest(split(a.prod, ', ')), amount, 0)) as amount
from `project.dataset.tableB` b
join `project.dataset.tableA` a
using(cust_id)
group by cust_id    

Also, note: for BigQuery - in general - storing multiple values in a string is a really good idea. :o)
See Denormalize data whenever possible for more on this

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269803

Hmmm . . . Try splitting the prod and join on that:

SELECT A.cust_id, SUM(B.amount) AS amount
FROM A CROSS JOIN
     UNNEST(SPLIT(a.prod, ', ')) p JOIN
     B
     ON A.cust_id = B.cust_id AND B.prod = p
GROUP BY 1;

Note: Storing multiple values in a string is a really bad idea. You can use a separate junction table (one row per customer and product) or use an array.

Upvotes: 3

Related Questions