Reputation: 3089
I'm trying to insert into a BigQuery table that has RECORD fields in REPEATED mode and I'm getting the error "repeated values added outside of an array". I've seen two other questions here about this error, but in those cases the record was not being passed as an array, which is not my case. I have other REPEATED RECORD fields that work fine with the exact same structure, so it looked like a bug in BigQuery.
The schema I used to create the table:
[
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "id",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "date",
        "type": "DATETIME"
      }
    ],
    "mode": "REPEATED",
    "name": "config_onboarding",
    "type": "RECORD"
  },
  {
    "mode": "NULLABLE",
    "name": "date_insert",
    "type": "DATETIME"
  },
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "key",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "value",
        "type": "STRING"
      }
    ],
    "mode": "REPEATED",
    "name": "features",
    "type": "RECORD"
  }
]
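For context, the error message itself means that a value for a REPEATED field was not sent as a JSON array. As a quick client-side sanity check (a standalone sketch, not part of the BigQuery library; the helper name is my own), rows can be validated against a schema shaped like the one above:

```python
def check_repeated_fields(row, schema):
    """Return the names of REPEATED fields whose values are not lists."""
    bad = []
    for field in schema:
        name = field["name"]
        if field.get("mode") == "REPEATED" and name in row:
            if not isinstance(row[name], list):
                bad.append(name)
    return bad

# Minimal stand-in for the table schema (modes only, nested fields omitted).
schema = [
    {"name": "config_onboarding", "type": "RECORD", "mode": "REPEATED"},
    {"name": "date_insert", "type": "DATETIME", "mode": "NULLABLE"},
    {"name": "features", "type": "RECORD", "mode": "REPEATED"},
]

row = {
    "config_onboarding": [{"id": "template-install"}],
    "features": {"key": "foo", "value": "bar"},  # not wrapped in a list
    "date_insert": "2021-07-02 18:43:44.810000",
}

print(check_repeated_fields(row, schema))  # -> ['features']
```

A row like the one above would trigger "repeated values added outside of an array" because `features` is a bare object instead of an array of objects.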
When I run the code to insert with only the config_onboarding field, it works:
self._client.insert_rows(
    table=table,
    rows=[
        {
            'config_onboarding': [
                {'id': 'template-install', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'run-workflow', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'template-workflow-view', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'automate-workflow-wrapper', 'date': '2021-07-01 19:50:00.901000'},
                {'id': 'share-workflow', 'date': '2021-07-01 19:50:00.901000'}
            ],
            'date_insert': datetime.now()
        }
    ],
    selected_fields=self._schema,
    ignore_unknown_values=True,
    skip_invalid_rows=True
)
The result is an empty array, meaning there were no errors, and I can confirm the row was added by querying the table.
But when I add the features field, which follows the exact same structure, it fails, returning an array of errors saying "repeated values added outside of an array":
self._client.insert_rows(
    table=table,
    rows=[
        {
            'config_onboarding': [
                {'id': 'template-install', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'run-workflow', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'template-workflow-view', 'date': '2021-07-01 18:43:44.810000'},
                {'id': 'automate-workflow-wrapper', 'date': '2021-07-01 19:50:00.901000'},
                {'id': 'share-workflow', 'date': '2021-07-01 19:50:00.901000'}
            ],
            'features': [
                {'key': 'foo', 'value': 'bar'},
                {'key': 'hello', 'value': 'World'}
            ],
            'date_insert': datetime.now()
        }
    ],
    selected_fields=self._schema,
    ignore_unknown_values=True,
    skip_invalid_rows=True
)
Do you guys have any clue?
It turned out that the schema definition for the features field was missing from the schema passed as the selected_fields parameter. That was the issue.
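A lightweight guard against this kind of drift is to compare each row's keys with the field names in the schema you pass as selected_fields before inserting (a standalone sketch; the helper name is my own, and plain dicts stand in for the real schema objects):

```python
def missing_schema_fields(rows, schema):
    """Return row keys that have no matching field in the schema."""
    schema_names = {field["name"] for field in schema}
    missing = set()
    for row in rows:
        missing.update(key for key in row if key not in schema_names)
    return sorted(missing)

# The schema as it was before the fix: `features` is absent.
schema = [
    {"name": "config_onboarding"},
    {"name": "date_insert"},
]
rows = [{
    "config_onboarding": [{"id": "template-install"}],
    "features": [{"key": "foo", "value": "bar"}],
    "date_insert": "2021-07-02 18:43:44.810000",
}]

print(missing_schema_fields(rows, schema))  # -> ['features']
```

Any non-empty result means a row carries a field the client cannot serialize against the supplied schema, which is exactly the situation that produced the error here.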
Upvotes: 0
Views: 482
Reputation: 1820
Based on the table schema you provided, I tried this on my end. You can use the code below to insert values into the table:
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of table to append to.
table_id = "ProjectId.DatasetId.TableId"

rows_to_insert = [
    {
        "config_onboarding": [{"id": "template-install", "date": "2021-07-01 18:43:44.810000"}],
        "date_insert": "2021-07-02 18:43:44.810000",
        "features": [{"key": "abcd", "value": "hundred"}],
    },
]

errors = client.insert_rows_json(table_id, rows_to_insert)  # Make an API request.
if errors == []:
    print("New rows have been added.")
else:
    print("Encountered errors while inserting rows: {}".format(errors))
Output: the rows were inserted with no errors ("New rows have been added.").
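One detail worth noting: insert_rows_json sends the payload as-is, so every value must already be JSON-serializable, which is why the example uses a string for date_insert rather than datetime.now(). A small helper for that conversion might look like this (a sketch; the function name is my own):

```python
from datetime import datetime

def to_bq_datetime(dt):
    """Format a datetime as the string form BigQuery DATETIME accepts."""
    return dt.strftime("%Y-%m-%d %H:%M:%S.%f")

print(to_bq_datetime(datetime(2021, 7, 2, 18, 43, 44, 810000)))
# -> 2021-07-02 18:43:44.810000
```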
Upvotes: 2