user7463780
user7463780

Reputation:

Big Query - Transpose Specific fields into Columns

We have a table in Big Query like below.

Input table:

 Name | Question  | Answer
 -----+-----------+-------
 Bob  | Interest  | a     
 Sue  | Interest  | a
 Sue  | Interest  | b
 Joe  | Interest  | b
 Joe  | Gender    | Male
 Bob  | Gender    | Female
 Sue  | DOB       | 2020-10-17

We want to convert the above table to the below format to make it BI/Visualisation friendly.

Target/Required table:

 +----------------------------------------+
 | Name | a | b | c | Gender | DOB        |
 +----------------------------------------+
 | Bob  | 1 | 0 | 0 | Female | 2020-10-17 |
 | Sue  | 1 | 1 | 0 |   -    |     -      |
 | Joe  | 0 | 1 | 0 |  Male  |     -      |
 +----------------------------------------+

Upvotes: 0

Views: 208

Answers (3)

Paul Maxwell
Paul Maxwell

Reputation: 35563

Use of coalesce() will supply the dashes without the complexity (and weaknesses) of dynamic sql and multiple selects. Note this is merely a variant of the answer by Gordon Linoff - I simply added the logic for dashes instead of NULL.

select name,
       countif(question = 'Interest' and answer = 'a') as a,
       countif(question = 'Interest' and answer = 'b') as b,
       countif(question = 'Interest' and answer = 'c') as c,
       coalesce(max(case when question = 'gender' then answer end),'-') as gender,
       coalesce(max(case when question = 'DOB' then answer end),'-') as dob
from t
group by name;

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

Below is for BigQuery Standard SQL and does not depend on knowing specific questions and generic enough for any values of questions and answers

EXECUTE IMMEDIATE (
  SELECT """
    SELECT name, """ || STRING_AGG("""MAX(IF(answer = '""" || value || """', 1, 0)) AS """ || value, ', ')   
FROM (
  SELECT DISTINCT answer value FROM `project.dataset.table`
  WHERE question = 'Interest' ORDER BY value
)) || (
  SELECT ", " || STRING_AGG("""MAX(IF(question = '""" || value || """', answer, '-')) AS """ || value, ', ')   
FROM (
    SELECT DISTINCT question value FROM `project.dataset.table`
    WHERE question != 'Interest' ORDER BY value
)) || """  
  FROM `project.dataset.table` 
  GROUP BY name
  """;    

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269493

Use conditional aggregation:

select name,
       countif(question = 'Interest' and answer = 'a') as a,
       countif(question = 'Interest' and answer = 'b') as b,
       countif(question = 'Interest' and answer = 'c') as c,
       max(case when question = 'gender' then answer end) as gender,
       max(case when question = 'DOB' then answer end) as dob
from t
group by name;

Note: This returns NULL when the values are missing. To me, that makes more sense than '-', although the logic could be tweaked to return hyphens instead.

Upvotes: 0

Related Questions