Reputation: 1071
I have 2 tables in BigQuery.
The first table is user_id table, each user has many labels associated with the user (label1, label2, label3, ...). The second table is product_id table, and each product also has the same number of labels associated with it (label1, label2, label3, ...)
Table 1:
user_id, label1, label2, label3, ... (hundreds of columns)
001 , 1 , 2 , 0 , ...
002 , 2 , 0 , 1 , ...
Table 2:
product_id, label1, label2, label3, ... (hundreds of columns)
a , 0 , 3 , 1 , ...
b , 1 , 2 , 0 , ...
I'd like to write a sql script to generate the following table. The label columns are calculated by labelX in user_id table - labelX in product_id table. For example, the label1 cell for the row with user_id=001 and product_id=a is calculated by 001's label1 - a's label1 = 1-0 = 1.
user_id, product_id, label1, label2, label3, ... (hundreds of columns)
001 , a , 1 , -1 , -1 , ...
001 , b , 0 , 0 , 0 , ...
002 , a , 2 , -3 , 0 , ...
002 , b , 1 , -2 , 1 , ...
Upvotes: 0
Views: 842
Reputation: 825
You can cross join both tables. The main query that you need to execute is :
select
a.user_id, b.product_id,
(a.label1 - b.label1) as label1,
(a.label2 - b.label2) as label2,
(a.label3 - b.label3) as label3,
...
from table1 as a
cross join table2 as b
This will require you to dynamically generate the query though based on labels. You can use programming language, or bigquery scripting as below. Make sure to replace the label count, and table names in the query.
DECLARE label_clause STRING;
SET (label_clause) = (
select as struct STRING_AGG('(a.label' || i || '-b.label'|| i ||') as label' || i, ',')
from unnest(generate_array(1,100)) i
);
EXECUTE IMMEDIATE
FORMAT("select a.user_id, b.product_id, %s from table1 as a cross join table2 as b", label_clause)
Upvotes: 1