user2692901
user2692901

Reputation: 11

Google BigQuery API, How to set the field type of destinationTable?

We are using BigQuery's Python API, specifically the jobs resource, to run a query on an existing BigQuery table, and to export the results by inserting the resulting dataset in a new BigQuery table (destinationTable).

Is there a way to also update the schema of the newly created table and set a specific datatype? By default, all fields are set to a 'string' type, but we need one of the fields to be 'timestamp'.

Upvotes: 1

Views: 505

Answers (2)

Pentium10
Pentium10

Reputation: 208042

In order to set the field types of the destination table you need to CAST to the new type in your query, as the result set describes the new field type in the destination table.

  1. You need to use simple CAST functions to have numbers/dates like.

SELECT TIMESTAMP(t) AS t FROM (SELECT "2015-01-01 00:00:00" t)

  1. Recently it was introduced the "unflatten" feature for record types, so you can now transfer the whole record to another table while preserving the RECORD structure - for that you'd need to set a destination table (and the desired write disposition), set allowLargeResults =TRUE, and then set Flatten Results= FALSE (see the last post in here where it is explained). Then you can run a query like this to transfer the whole record to the dest table:

SELECT cell.* FROM publicdata:samples.trigrams LIMIT 0;

I'm using tables from the publicdata:samples dataset which is also available to you, so you can run these tests, too. In the above query 'cell' is a record, and if you set Flatten Results=FALSE, you'll see that 'cell' is still a RECORD in your dest table.

  1. You can remove some fields from your record when transferring data to the dest table. Here is the query that demonstrates this (again, you'd need to run it with Flatten Results=FALSE):

SELECT cell.value, cell.volume_count FROM publicdata:samples.trigrams LIMIT 0;

After you run the above query, the 'cell' record will only contain the fields you specified.

  1. You can rename an existing field within a record when transferring data to the dest table:

SELECT cell.value AS cell.newvalue FROM publicdata:samples.trigrams LIMIT 0;

  1. Unfortunately, currently there is no way to add a field to a record, e.g., the following query will create 'url' outside of both 'actor_attributes' and 'repository' records.

SELECT
actor_attributes.blog,
repository.created_at,
repository.url AS actor_attributes.url
FROM publicdata:samples.github_nested
LIMIT 0;

So in order to add a field to a record, you'd need to export your data, process it outside of BigQuery, and then load it back with the new schema.

Upvotes: 2

Felipe Hoffa
Felipe Hoffa

Reputation: 59355

The field types of the destination table will be automatically set. If you need to transform a string to an integer or timestamp, do so in the query.

This will create a destination table with one column (string):

SELECT x FROM (SELECT "1" x)

This will create a destination table with one column (integer):

SELECT INTEGER(x) AS x FROM (SELECT "1" x)

This will create a destination table with one column (timestamp):

SELECT TIMESTAMP(x) AS x FROM (SELECT "2015-10-21 04:29:00" x)

Upvotes: 0

Related Questions