Oded Badt
Oded Badt

Reputation: 323

BigQuery: Is it posible to modify a table schema by adding a field within a record

BigQuery manual States that it is only possible to add a new field but not modify an existing one. My question is whether it is possible to add an existing field in to a RECORD field. Say the original schema is:

{"type":"RECORD","name":"record","mode":"REPEATED"
 "fields":[
    {"type":"STRING","name":"f1","mode":"NULLABLE"}
  ]
}

And I would like to add f2 so the schema would be:

{"type":"RECORD","name":"record","mode":"REPEATED"
 "fields":[
    {"type":"STRING","name":"f1","mode":"NULLABLE"},
    {"type":"STRING","name":"f2","mode":"NULLABLE"}
  ]
}

Is it possible?

Upvotes: 1

Views: 10867

Answers (3)

Yasin Hassanien
Yasin Hassanien

Reputation: 4045

The only way is to cast the select query using cast() and then save it as bigquery table and then replace delete the old table then save the query again with the original table name

Select cast(section number,INT64) as section_number,section_name from Table.

As you can see the cast here plays the role of parsing the number from string to an INT and the result will automatically change

Upvotes: 0

Kyle
Kyle

Reputation: 1

A work around when using the console would be to query the original table, unnesting the RECORD type and putting it back into a STRUCT, adding the column you want and put in with a placeholder value that matches the type you want.

SELECT STRUCT(a.foo as foo, a.bar as bar, 'hello' as baz) as words, time, id FROM dataset.table, UNNEST(words) as a;

This query result could be saved as a table, then you can go in and do update queries to change the 'hello' to actual text you want stored.

Upvotes: 0

rmesteves
rmesteves

Reputation: 4085

Adding a new field to a STRUCT is not supported in the console but you can add it using the BigQuery CLI as you can see here

In the Console mode:

Adding a new nested field to an exising RECORD column is not currently supported by the classic BigQuery web UI.

Using the BigQuery CLI:

In this option, you can create a new schema with and use bq update project_id:dataset.table schema to update the table.

As you can find in the link:

  1. First, issue the bq show command with the --schema flag and write the existing table schema to a file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset.table. [...]
bq show \
--schema \
--format=prettyjson \
project_id:dataset.table > schema_file
  1. Open the schema file in a text editor. The schema should look like the following. In this example, column3 is a nested repeated column. The nested columns are nested1 and nested2. The fields array lists the fields nested within column3. [...]
  2. Add the new nested column to the end of the fields array. In this example, nested3 is the new nested column. [...]
  3. After updating your schema file, issue the following command to update the table's schema. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: project_id:dataset. [...]

bq update mydataset.mytable /tmp/myschema.json

Hope it helps

Upvotes: 9

Related Questions