Reputation: 471
I have created a race table in my cloud project named race. The table contains one integer field and one string field:
Then I try to perform a join query with publicdata.samples.natality and my race table:
SELECT COUNT(*),natality.year FROM [publicdata:samples.natality] natality INNER JOIN [regal-cider-369:ab.race] race ON natality.child_race = race.id GROUP BY 2 LIMIT 50
and I get the following error:
Fields child_race and id have different types int32 and int64, and therefore can't be compared as join attributes
It can be a limitation in the Bigquery engine, and it's ok.
However, the problem is that I can't find any place in the API where I can specify whether I want a field to be int32 or int64 when I create my race table (The only integer data type I've seen up to now in the documentation are STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP or RECORD (https://developers.google.com/bigquery/docs/reference/v2/tables#schema.fields.type).
Did I miss something? or is the "natality" sample table using a data type which is not available in the API?
Upvotes: 2
Views: 5333
Reputation: 26617
All integers in BigQuery should be int64s (BigQuery calls them INTEGER). If they're not, it is a bug. It is possible that some of the sample tables have values where the underlying type is int32, however.
Are you only seeing this on the natality table? If so, we should fix it.
A workaround is to coerce the values to an INTEGER ... you'll need to use a subquery, however. As in:
SELECT COUNT(*), natality.year FROM (
SELECT year, INTEGER(child_race) as child_race
FROM [publicdata:samples.natality]) AS natality
INNER JOIN
(SELECT INTEGER(id) as id FROM [regal-cider-369:ab.race]) AS race
ON natality.child_race = race.id
GROUP BY 2 LIMIT 50
Upvotes: 3