VSR
VSR

Reputation: 97

I want to join two tables with a common column in Big query?

To join the tables, I am using the following query.

SELECT *
FROM(select user as uservalue1 FROM [projectname.FullData_Edited]) as FullData_Edited 
JOIN (select user as uservalue2 FROM [projectname.InstallDate]) as InstallDate 
ON FullData_Edited.uservalue1=InstallDate.uservalue2;

The query works but the joined table only has two columns uservalue1 and uservalue2. I want to keep all the columns present in both the table. Any idea how to achieve that?

Upvotes: 2

Views: 9750

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

#legacySQL
SELECT <list of fields to output>
FROM [projectname:datasetname.FullData_Edited] AS FullData_Edited
JOIN [projectname:datasetname.InstallDate] AS InstallDate
ON FullData_Edited.user = InstallDate.user

or (and preferable)

#standardSQL
SELECT <list of fields to output>
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
ON FullData_Edited.user = InstallDate.user

Note, using SELECT * in such cases lead to Ambiguous column name error, so it is better to put explicit list of columns/fields you need to have in your output

The way around it is in using USING() syntax as in example below.
Assuming that user is the ONLY ambiguous field - it does the trick

#standardSQL
SELECT *
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
USING (user)

For example:

#standardSQL
WITH `projectname.datasetname.FullData_Edited` AS (
  SELECT 1 user, 'a' field1
),
`projectname.datasetname.InstallDate` AS (
  SELECT 1 user, 'b' field2
)
SELECT *
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
USING (user)

returns

user    field1  field2   
1       a       b    

whereas using ON FullData_Edited.user = InstallDate.user gives below error

Error: Duplicate column names in the result are not supported. Found duplicate(s): user

Upvotes: 6

Gordon Linoff
Gordon Linoff

Reputation: 1269493

Don't use subqueries if you want all columns:

SELECT *
FROM [projectname.FullData_Edited] as FullData_Edited JOIN
     [projectname.InstallDate] as InstallDate 
     ON FullData_Edited.uservalue1 = InstallDate.uservalue2;

You may have to list out the particular columns you want to avoid duplicate column names.

While you are at it, you should also switch to standard SQL.

Upvotes: 1

Related Questions