Reputation: 749
I have been continuously streaming data to BigQuery using the Python google-cloud package (from google.cloud import bigquery).
What I have observed is that it refuses to insert a few rows, reporting:
[{u'debugInfo': u'', u'reason': u'invalid', u'message': u'no such field.', u'location': u'user.new_user'}]}]
But I can see that column in the schema (table.schema):
[(u'USER', u'record', u'NULLABLE', None, (SchemaField(u'new_user', u'string', u'NULLABLE', None, ())))]
Is this because I am trying to stream and update at a higher rate than the limits mentioned in the BigQuery docs?
I tried running the same thing from the terminal and it worked with no errors. The failures only happen when I try to stream at a higher rate.
For now, I am calling it as:
self.bigquery_client.create_rows_json(table, batched_event, retry=bigquery.DEFAULT_RETRY.with_deadline(10), skip_invalid_rows=True, ignore_unknown_values=True)
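A minimal sketch of how the per-row errors returned by that call could be logged instead of being silently dropped (this assumes the older client API, where create_rows_json returns a list of error mappings; the dataset, table, and row names below are hypothetical stand-ins for the objects used above):

    from google.cloud import bigquery

    client = bigquery.Client()  # stands in for self.bigquery_client above
    table = client.get_table(client.dataset('my_dataset').table('events'))  # hypothetical names
    batched_event = [{'USER': {'new_user': 'abc'}}]  # example row matching the schema above

    # create_rows_json returns one mapping per rejected row;
    # an empty list means the whole batch was accepted.
    errors = client.create_rows_json(
        table,
        batched_event,
        retry=bigquery.DEFAULT_RETRY.with_deadline(10),
        skip_invalid_rows=True,
        ignore_unknown_values=True,
    )
    for err in errors:
        # 'index' is the position of the failed row in the batch; 'errors'
        # holds the reason/location details, e.g. the "no such field" message.
        print(err.get('index'), err.get('errors'))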
Upvotes: 4
Views: 892
Reputation: 4384
If you're modifying the schema while using streaming, the streaming system doesn't immediately pick up the schema changes. More info:
https://cloud.google.com/bigquery/troubleshooting-errors#metadata-errors-for-streaming-inserts
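A rough sketch of one way to ride this out, assuming the same older client API as in the question (create_rows_json returning per-row error mappings). The linked page notes that schema changes can take some time to reach the streaming system, so rows that fail only with the "no such field" error are retried after a wait; the attempt count and wait time below are illustrative, not documented values:

    import time

    def insert_with_schema_retry(client, table, rows, attempts=5, wait_seconds=60):
        """Retry a streaming insert that fails only because the streaming
        backend has not yet seen a recently added column."""
        errors = []
        for _ in range(attempts):
            errors = client.create_rows_json(table, rows)
            if not errors:
                return []  # everything accepted
            stale_schema = all(
                e.get('reason') == 'invalid' and 'no such field' in e.get('message', '')
                for row in errors
                for e in row.get('errors', [])
            )
            if not stale_schema:
                return errors  # genuinely bad rows; do not keep retrying
            rows = [rows[err['index']] for err in errors]  # resend only the rejected rows
            time.sleep(wait_seconds)  # give the streaming metadata time to catch up
        return errors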
Upvotes: 3