mp.kaur
mp.kaur

Reputation: 39

Merging two distinct rows using Bigquery

I have a dataset with more than 20 columns. It consists of transactions in cash and non-cash form for the year 2020 and 2021.

If the transaction date lies in 2020, its amount is mentioned.[Blue color] Then for 2021, a random 100 amount is assigned.

Similarly, if the transaction date is in 2021, its amount is mentioned. [Pink color] Then for 2020, a random 100 amount is assigned. enter image description here

REQUIREMENT - For a person , if cash and non cash method exists , can it be combined/ merged as SINGLE row i.e [Blue and Pink figures in a single row] using BigQuery

REQUIRED RESULT - enter image description here

Upvotes: 0

Views: 72

Answers (1)

Mohammad
Mohammad

Reputation: 678

Tried to replicate your scenario here:

with raw_data as(

select 'MP' as Name, 10 as ID, 'Cash' as Method, 'SAN JOSE' AS CITY, 'CA' AS STATE, '2020-01-20' AS TRANSACTION_DATE, 2000 AS Q1_2020,1200 AS Q2_2020, 2000 AS Q3_2020,5080 AS Q4_2020, 100 AS Q1_2021,100 AS Q2_2021, 100 AS Q3_2021 UNION ALL
select 'MP' as Name, 12 as ID, 'Non-Cash' as Method, 'san jose' AS CITY, 'CA' AS STATE, '2021-02-03' AS TRANSACTION_DATE, 100 AS Q1_2020,100 AS Q2_2020, 100 AS Q3_2020,100 AS Q4_2020, 2000 AS Q1_2021,3000 AS Q2_2021, 4000 AS Q3_2021 UNION ALL
select 'KS' as Name, 11 as ID, 'Cash' as Method, 'LAKE FOREST' AS CITY, 'CA' AS STATE, '2020-02-12' AS TRANSACTION_DATE, 2000 AS Q1_2020,4000 AS Q2_2020, 5000 AS Q3_2020,2200 AS Q4_2020, 100 AS Q1_2021,100 AS Q2_2021, 100 AS Q3_2021 UNION ALL
select 'KS' as Name, 34 as ID, 'Non-Cash' as Method, 'lake forest' AS CITY, 'CA' AS STATE, '2021-04-03' AS TRANSACTION_DATE, 100 AS Q1_2020,100 AS Q2_2020, 100 AS Q3_2020,100 AS Q4_2020, 3000 AS Q1_2021,1500 AS Q2_2021, 2500 AS Q3_2021 UNION ALL
select 'GS' as Name, 21 as ID, 'Cash' as Method, 'EDITION' AS CITY, 'NJ' AS STATE, '2020-03-20' AS TRANSACTION_DATE, 9000 AS Q1_2020,1000 AS Q2_2020, 2300 AS Q3_2020,1200 AS Q4_2020, 100 AS Q1_2021,100 AS Q2_2021, 100 AS Q3_2021 UNION ALL
select 'GS' as Name, 22 as ID, 'Non-Cash' as Method, 'edition' AS CITY, 'NJ' AS STATE, '2021-05-12' AS TRANSACTION_DATE, 100 AS Q1_2020,100 AS Q2_2020, 100 AS Q3_2020,100 AS Q4_2020, 2300 AS Q1_2021,3200 AS Q2_2021, 4500 AS Q3_2021 UNION ALL
select 'JSK' as Name, 27 as ID, 'Cash' as Method, 'Piscataway' AS CITY, 'NJ' AS STATE, '2020-02-13' AS TRANSACTION_DATE, 2345 AS Q1_2020,5600 AS Q2_2020, 1200 AS Q3_2020,2100 AS Q4_2020, 100 AS Q1_2021,100 AS Q2_2021, 100 AS Q3_2021 
),

Cleaned_data as (
SELECT Name,STATE,MAX(Q1_2020) Q1_2020, MAX(Q2_2020) Q2_2020,MAX(Q3_2020) Q3_2020, MAX(Q4_2020) Q4_2020,
MAX(Q1_2021) Q1_2021, MAX(Q2_2021) Q2_2021, MAX(Q3_2021) Q3_2021 
FROM raw_data 
GROUP BY Name,STATE)

select rd.Name, rd.ID,rd.Method, rd.CITY,rd.STATE, rd.TRANSACTION_DATE,

 case when rd.Method = 'Non-Cash' then 0 else  CL.Q1_2020 end as Q1_2020,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q2_2020 end as Q2_2020,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q3_2020 end as Q3_2020,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q4_2020 end as Q4_2020,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q1_2021 end as Q1_2021,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q2_2021 end as Q2_2021,
 case when rd.Method = 'Non-Cash' then 0 else  CL.Q3_2021 end as Q3_2021
from Cleaned_data  CL 
left join raw_data rd 
on rd.name= CL.Name and rd.STATE=CL.STATE;

Here it output what I got:

enter image description here

Upvotes: 1

Related Questions