I have created a Dataflow pipeline (a Python script) that moves data from MongoDB to BigQuery. The pipeline has been working properly for the last month, even with nested fields. The process is as follows: check the schema of the final BigQuery table > ReadFromMongo, passing a projection argument based on that schema > add some datetime fields > WriteToBigquery.
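To make the projection step concrete, here is a minimal sketch of how a projection could be derived from the table schema. The helper name `build_projection` and the shape of the schema input (a list of dicts with a `name` key) are assumptions for illustration, not the actual code from the pipeline.

```python
def build_projection(schema_fields):
    """Build a MongoDB projection that keeps only the top-level schema fields.

    `schema_fields` is assumed to be a list of dicts with a "name" key,
    for example the "fields" list of a BigQuery table schema.
    """
    projection = {field["name"]: 1 for field in schema_fields}
    # MongoDB returns _id unless it is explicitly excluded; _id is the one
    # field that may be excluded inside an otherwise inclusive projection.
    if "_id" not in projection:
        projection["_id"] = 0
    return projection


# Hypothetical schema with a single nested, repeated field.
example_schema = [{"name": "responses", "type": "RECORD", "mode": "REPEATED"}]
projection = build_projection(example_schema)
```

The resulting dict would then be passed as the projection argument of the MongoDB read step.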
The job started failing once it had to move data held in fields of type ARRAY of STRUCT.
The field has the following schema:
responses ARRAY<STRUCT<
step STRING,
question STRING,
questionKind STRING,
questionLabel STRING,
questionFriendlyName STRING,
reviewables ARRAY<STRING>,
tags ARRAY<STRING>,
choiceTags ARRAY<STRING>,
maxValue STRING,
score STRING,
value STRING
>>
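For comparison, the same field can be written in the dictionary form that BigQuery table schemas use (`name`, `type`, `mode`, nested `fields`). This is only a sketch of that field: the variable names are made up, and the nullable mode of the scalar sub-fields is an assumption, since the DDL above does not state it.

```python
# Sub-fields of each STRUCT, split by kind (taken from the schema above).
SCALAR_SUBFIELDS = [
    "step", "question", "questionKind", "questionLabel",
    "questionFriendlyName", "maxValue", "score", "value",
]
ARRAY_SUBFIELDS = ["reviewables", "tags", "choiceTags"]

# ARRAY<STRUCT<...>> maps to a RECORD field with mode REPEATED;
# ARRAY<STRING> maps to a STRING field with mode REPEATED.
RESPONSES_FIELD = {
    "name": "responses",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": (
        # Assumption: scalar sub-fields are nullable.
        [{"name": n, "type": "STRING", "mode": "NULLABLE"} for n in SCALAR_SUBFIELDS]
        + [{"name": n, "type": "STRING", "mode": "REPEATED"} for n in ARRAY_SUBFIELDS]
    ),
}

# Only the "responses" field is shown; the real table has more columns.
TABLE_SCHEMA = {"fields": [RESPONSES_FIELD]}
```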
Some of the rows the job tried to insert look like this:
{"responses": []}
{"responses": [{"step": "66e32006732e5b66417c2b39", "question": "66e318f7732e5b66417be614", "questionKind": "star", "questionLabel": "Were you treated kindly?", "questionFriendlyName": "Staff", "reviewables": [], "tags": [], "choiceTags": [], "score": "100", "maxValue": "5", "value": "5"}]}
{"responses": [{"step": "66e32006732e5b66417c2b39", "question": "66e318f7732e5b66417be614", "questionKind": "star", "questionLabel": "Were you treated kindly?", "questionFriendlyName": "Staff", "reviewables": [], "tags": [], "choiceTags": [], "score": "100", "maxValue": "5", "value": "5"}, {"step": "66e32006732e5b66417c2b3a", "question": "66e318f9732e5b66417be941", "questionKind": "star", "questionLabel": "Were you listened to?", "questionFriendlyName": "Listened to", "reviewables": [], "tags": [], "choiceTags": [], "score": "100", "maxValue": "5", "value": "5"}]}
I have searched many websites but found nothing. As far as I can tell, these rows have the proper structure and match the table field schema, but the job fails every time I run it.
Can anyone help me with this? Thanks :)
I tried handling empty arrays of structs, converting the rows to JSON before inserting, and checking both schemas, but nothing worked.
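For anyone trying to reproduce this, the kind of normalisation meant by "handling empty arrays of structs" can be sketched as below. The function names are hypothetical and this is not the exact code from the pipeline. It assumes every sub-field is a string or an array of strings, as in the schema above, and relies on the fact that BigQuery arrays cannot contain NULL elements.

```python
SCALAR_KEYS = (
    "step", "question", "questionKind", "questionLabel",
    "questionFriendlyName", "maxValue", "score", "value",
)
ARRAY_KEYS = ("reviewables", "tags", "choiceTags")


def normalize_response(resp):
    """Return one response dict with exactly the keys the schema expects."""
    # Missing or None scalars stay None; everything else is cast to str.
    out = {
        key: (None if resp.get(key) is None else str(resp[key]))
        for key in SCALAR_KEYS
    }
    # Missing or None arrays become []; None elements are dropped because
    # BigQuery does not accept NULL elements inside an array.
    for key in ARRAY_KEYS:
        out[key] = [str(v) for v in (resp.get(key) or []) if v is not None]
    return out


def normalize_row(row):
    """Return a copy of the row whose "responses" value is always a list."""
    fixed = dict(row)
    fixed["responses"] = [
        normalize_response(r) for r in (row.get("responses") or [])
    ]
    return fixed
```

In a Beam pipeline a function like this would typically be applied with `beam.Map(normalize_row)` just before the write step.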