NikM
NikM

Reputation: 173

BigQuery: Arrange columns from 2 table to get a single table but only get unique rows from table2

I have two tables and in the output, I want a single table with rows from both the tables one below the other.

Refer image. Yellow cells in table2 should not be present in the output, since they are present in table1, as highlighted. So only sr.no. 2,4,5,7 from table-2 should be present in output table.

Example input and desired output

Can someone suggest solution?

The logic I tried is UNION to first get all rows one below the other. However I'm not getting logic to exclude matching rows from table2 after UNION. I tried multiple options as commented below in my code:

  with a AS (
  SELECT
    email AS leademail,
    website AS leadwebsite,
    productname AS leadproductname,
    source AS leadsource,
    detail AS leaddetail,
    leaddate
  FROM
    `table1`),
  b AS (
  SELECT
    email AS contactemail,
    website AS contactwebsite,
    productname AS contactproductname,
    source AS contactsource,
    detail AS contactdetail,
    contactdate
  FROM
    `table2`)
SELECT
  *
FROM
  a union all
SELECT
  *
FROM
  b
  --option1 except distinct select * from b
  --option2 union all select * from b
  --option3 except distinct select * from b
  --option4 left join
  --(select * from b where contactemail not in (select leademail from a))
ON
  leademail=contactemail
  AND leadproductname=contactproductname

Upvotes: 0

Views: 61

Answers (1)

Pol Ortiz
Pol Ortiz

Reputation: 462

I was able to perform this query by using the UNION ALL operator to append filtered rows from the table2 to the table1 and I used the operator NOT IN to filter the rows from table2.

SELECT
  *
FROM
  table1
UNION ALL (
  SELECT table2.*
  FROM
    table2
  WHERE
    table2.email NOT IN (SELECT email FROM table1)
    OR table2.productname NOT IN (SELECT productname FROM table1)
)

Also, I made the following executable query based on your example data so more people can play with:

WITH table1 AS (
  SELECT
    "[email protected]" AS email,
    "abc.com" AS website,
    "msoffice" AS productname,
    "abc" AS source,
    "one" AS detail,
    "12-Dec-19" AS leaddat
  UNION ALL SELECT "[email protected]","pqr1.com","chrome","pqr1","two","12-Dec-19"
  UNION ALL SELECT "[email protected]","xyz.com","iphone","xyz",NULL,"12-Dec-19"
  UNION ALL SELECT "[email protected]","zzz.com","macbook","zzz","three","12-Dec-19"
  UNION ALL SELECT "[email protected]","xyz1.com","itunes","xyz1",NULL,"12-Jan-20"
  UNION ALL SELECT "[email protected]",NULL,"googlecloud","xyz2",NULL,"12-Jan-20"
  UNION ALL SELECT "[email protected]","123.com","yahoomail","123","xyz","12-Jan-20"
), table2 as (
  SELECT
    NULL AS email,
    "abc.com" AS website,
    "msoffice" AS productname,
    "abc" AS source,
    "one" AS detail,
    "11-Nov-19" AS leaddat
  UNION ALL SELECT "[email protected]","pqr.com","playstore","pqr1","two","12-Jan-20"
  UNION ALL SELECT "[email protected]","123.com",NULL,"123","xyz","12-Oct-19"
  UNION ALL SELECT "[email protected]","abc1.com",NULL,"xyz",NULL,"12-Jan-20"
  UNION ALL SELECT "[email protected]",NULL,"ipod","zzz","three","12-Jan-20"
  UNION ALL SELECT NULL,NULL,"googlecloud","xyz2",NULL,"12-Jan-20"
  UNION ALL SELECT "[email protected]",NULL,"yahoomail",NULL,NULL,"12-Jan-20"
)

SELECT
  *
FROM
  table1
UNION ALL (
  SELECT table2.*
  FROM
    table2
  WHERE
    table2.email NOT IN (SELECT email FROM table1)
    OR table2.productname NOT IN (SELECT productname FROM table1)
)

Hope it helps.

Upvotes: 2

Related Questions