bcli4d
bcli4d

Reputation: 51

Getting an "Invalid schema update. Cannot add fields" from BQ, with ALLOW_FIELD_ADDITION set in the configuration

The following python code snippet produces the error in the title:

job_config = bigquery.QueryJobConfig()
# Set the destination table
table_ref = client.dataset(args.bq_dataset_id).table(args.bq_cum_table)
job_config.destination = table_ref
job_config.write_disposition = 'WRITE_APPEND'
job_config.schemaUpdateOptions = ['ALLOW_FIELD_ADDITION', 'ALLOW_FIELD_RELAXATION']

# Start the query, passing in the extra configuration.
query_job = client.query(
    sqlstr,
    # Location must match that of the dataset(s) referenced in the query
    # and of the destination table.
    location="US",
    job_config=job_config,
)  # API request - starts the query

query_job.result()  # Wait for the query to finish

I'm working from the "Writing query results to a destination table" here, but with the additional 'ALLOW_FIELD_ADDITION' and 'ALLOW_FIELD_RELAXATION' config settings to (I thought) enable adding additional columns to the target table. The schema of the data which I am trying to append probably is not a superset of the schema of the target table. And I have no control over the order of items resulting from the query, in case either if those considerations matter.

I'm using standard SQL. The query is like:

SELECT `StudyDate`, `SeriesDate`, `AcquisitionDate` FROM `project.dataset.table` LIMIT 1

This is my first use of the BQ Python library so I won't be surprised if I've overlooked something.

Thanks.

Upvotes: 5

Views: 10952

Answers (1)

llompalles
llompalles

Reputation: 3176

You are doing it right except for one little mistake.

job_config.schemaUpdateOptions is a field that doesn’t exist in the BigQuery Python API. Change it for job_config.schema_update_options and your code will work as expected.

Also, here are several examples showing how to modify the BigQuery table schema using Python.

Upvotes: 9

Related Questions