J.Fratzke
J.Fratzke

Reputation: 1465

Multiple Joins And Writing to Destination Table with BigQuery

I have the following query that works fine if I DON'T set a destination table.

SELECT soi.customer_id
, p.department
, p.category
, p.subcategory
, p.tier1
, p.tier2
, pc.bucket as categorization
, SUM(soi.price) as demand
, COUNT(1) as cnt
FROM store.sales_item soi 
INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
INNER JOIN daily_customer_fact.dcf_product_categorization pc 
ON (p.department = pc.department
    AND p.category = pc.category 
    AND p.subcategory = pc.subcategory 
    AND p.tier1 = pc.tier1 
    AND p.tier2 = pc.tier2)
    WHERE DATE(soi.created_timestamp) < current_date()
    GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10

However, if I set a destination table, it fails with

Error: Ambiguous field name 'app_version' in JOIN. Please use the table qualifier before field name.

That column exists on the store.sales_item table, but I'm not selecting nor joining to that column.

Upvotes: 2

Views: 120

Answers (1)

Danny Kitt
Danny Kitt

Reputation: 3251

I've seen this error message before, and it points to the following:

  • Your query job when specifying a destination table is setting flattenResults to false.
  • Both of the store.sales_item and datamart.product tables contain a field named "app_version".

If so, I recommend looking at this answer: https://stackoverflow.com/a/28996481/4001094

As well as this issue report: https://code.google.com/p/google-bigquery/issues/detail?id=459


In your case, you should be able to make your query succeed by doing something like the following, using suggestion #3 from the answer linked above. I'm unable to test it as I don't have access to your source tables, but it should be close to working with flattenResults set to false.

SELECT soi_and_p.customer_id
, soi_and_p.department
, soi_and_p.category
, soi_and_p.subcategory
, soi_and_p.tier1
, soi_and_p.tier2
, pc.bucket as categorization
, SUM(soi_and_p.price) as demand
, COUNT(1) as cnt
FROM 
  (SELECT soi.customer_id AS customer_id
   , p.department AS department
   , p.subcategory AS subcategory
   , p.tier1 AS tier1
   , p.tier2 AS tier2
   , soi.price AS price
   , soi.created_timestamp AS created_timestamp
  FROM store.sales_item soi 
  INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
  ) as soi_and_p
INNER JOIN daily_customer_fact.dcf_product_categorization pc 
ON (soi_and_p.department = pc.department
    AND soi_and_p.category = pc.category 
    AND soi_and_p.subcategory = pc.subcategory 
    AND soi_and_p.tier1 = pc.tier1 
    AND soi_and_p.tier2 = pc.tier2)
    WHERE DATE(soi_and_p.created_timestamp) < current_date()
    GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10

Upvotes: 1

Related Questions