Nivi
Nivi

Reputation: 1147

Merging records based on condition in bigquery

I have multiple rows for members and want to merge them based on the values of two columns by giving priority to the value 'Yes'.

Name | Status1  |  Status2
Jon  |  Yes     |  No
Jon  |  No      |  Yes

I want the query to return

Name | Status1 | Status2
Jon  |  Yes    |  Yes

So, if the column has Yes even once, it has to assign Yes for the person and No otherwise.

Upvotes: 2

Views: 1179

Answers (2)

Eray Balkanli
Eray Balkanli

Reputation: 8000

In addition to Mikhail's answer, I am adding another solution with MsSQL. Syntax may be different but the logic would be similar:

create table test
(id int , name1 varchar(10), name2 varchar(10))

insert into test values (1,'yes','no')
insert into test  values (2,'no','no')
insert into test values (3,'yes','yes')

declare @searchKey varchar(10) = 'yes'
declare @cols varchar(255) = (SELECT STUFF((
    SELECT ', ' + c.name 
    FROM sys.columns  c
    JOIN sys.types AS t ON c.user_type_id=t.user_type_id
    WHERE t.name != 'int'  AND t.name != 'bit' AND t.name !='date' AND t.name !='datetime'
        AND object_id =(SELECT object_id FROM sys.tables WHERE name='test')
    FOR XML PATH('')),1,2,''))

declare @sql nvarchar(max) = 'SELECT * from test where '''+@searchKey+''' in ('+@cols+')'
exec sp_executesql @sql

Edit: Please note that this solution checks all the columns of a table if a specific value is included by any column. Assume the OP needs to check 100 columns, until status100, then I believe a dynamic solution like that would be more handy.

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173210

Below is for BigQuery Standard SQL

#standardSQL
SELECT Name, MAX(Status1) AS Status1, MAX(Status2) AS Status2
FROM `project.dataset.table`
GROUP BY Name   

You can test, play with it using sample data

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Jon' Name, 'Yes' Status1, 'No' Status2 UNION ALL
  SELECT 'Jon', 'No', 'Yes' 
)
SELECT Name, MAX(Status1) AS Status1, MAX(Status2) AS Status2
FROM `project.dataset.table`
GROUP BY Name  

with result

Row Name    Status1 Status2  
1   Jon     Yes     Yes  

Upvotes: 5

Related Questions