Reputation: 41
I am building a data pipeline from PubSub to Beam (Direct/Dataflow Runner) to Big Query. Today we started to run into issues where beam IO BigQuery connector stopped creating tables automatically and produced no error messages (Logging level set to DEBUG).
Here is a snippet of what the BigQuery PTransform looks like:
beam.io.WriteToBigQuery(
table=bq_table,
schema=to_bq_schema(table),
write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)
Note that bq_table and schema should be correct. We have tried even reducing the schema to single column.
Upvotes: 0
Views: 418
Reputation: 41
I was able to resolve the issue. Turned out my timestamps had +00000 timezone awareness. It was very difficult to debug this because it silently failed and is not trivial to attach a debugger on the runner.
Upvotes: 3