SubZeno
SubZeno

Reputation: 380

Duplicate column names in BigQuery when selecting twice within the same table

I am trying to perform the following query in BigQuery:

SELECT * FROM
  (
    SELECT
      date,
      grob as `grob_r1`
    FROM `mytab_tbl`
    WHERE region = "r1"
  )
  AS TAB_1,
  (
    SELECT
      date,
      grob as `grob_r2`
    FROM `mytab_tbl`
    WHERE region = "r2"
  )
  AS TAB_2
WHERE TAB_1.date = TAB_2.date

It is basically a selection on the same table applying different condition rules, with the aim to then create a view. However, I get the following error: Duplicate column names in the result are not supported. Found duplicate(s): date

Do you have any idea about how to fix this?

Upvotes: 0

Views: 1191

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

First, learn to use proper JOIN syntax.

Then, BigQuery allows you to select full records. I usually recommend that approach:

SELECT TAB_1, TAB_2
FROM (SELECT date, grob as grob_r1
      FROM `mytab_tbl`
      WHERE region = 'r1'
     ) TAB_1 JOIN
     (SELECT date, grob as grob_r2
      FROM `mytab_tbl`
      WHERE region = 'r2'
     ) TAB_2
     ON TAB_1.date = TAB_2.date;

In this case, you probably don't need DATE twice. BigQuery supports the standard USING clause, which allows you to do:

SELECT *
FROM (SELECT date, grob as grob_r1
      FROM `mytab_tbl`
      WHERE region = 'r1'
     ) TAB_1 JOIN
     (SELECT date, grob as grob_r2
      FROM `mytab_tbl`
      WHERE region = 'r2'
     ) TAB_2
     USING (date);

This returns three columns.

Another handy alternative -- which is not needed in this case -- is EXCEPT:

SELECT TAB_1.*,
       TAB_2.* EXCEPT (date)
FROM (SELECT date, grob as grob_r1
      FROM `mytab_tbl`
      WHERE region = 'r1'
     ) TAB_1 JOIN
     (SELECT date, grob as grob_r2
      FROM `mytab_tbl`
      WHERE region = 'r2'
     ) TAB_2
     ON TAB_1.date = TAB_2.date;

Upvotes: 1

Francesco Galletta
Francesco Galletta

Reputation: 718

Mainly, your SELECT * will just get all columns from TAB_1 and TAB_2, and in both cases, you have the date field.

SELECT TAB_1.*, TAB_2.grob_r2 FROM
  (
    SELECT
      date,
      grob as `grob_r1`
    FROM `mytab_tbl`
    WHERE region = "r1"
  )
  AS TAB_1,
  (
    SELECT
      date,
      grob as `grob_r2`
    FROM `mytab_tbl`
    WHERE region = "r2"
  )
  AS TAB_2
WHERE TAB_1.date = TAB_2.date

In your case, your join condition is an equality, but if you wanted to get the date from TAB_2 as well in your results, or if you want to get all columns from TAB_2, whatever they are, you could do something like

SELECT TAB_1.*, TAB_2.* EXCEPT(date), TAB_2.date AS tab_2_date FROM ...

edit: To help you with the FULL JOIN question.

SELECT TAB_1.*, TAB_2.grob_r2 FROM
  (
    SELECT
      date,
      grob as `grob_r1`
    FROM `mytab_tbl`
    WHERE region = "r1"
  )
  AS TAB_1 FULL JOIN 
  (
    SELECT
      date,
      grob as `grob_r2`
    FROM `mytab_tbl`
    WHERE region = "r2"
  )
  AS TAB_2
ON TAB_1.date = TAB_2.date

Upvotes: 1

Related Questions