Mauricio
Mauricio

Reputation: 3089

Bigquery error "repeated values added outside of an array" even with value as array

I'm trying to save a RECORD mode REPEATED on BigQuery and I'm getting the error repeated values added outside of an array. I've seen two other questions here regarding this error. But they seem to be the case that the record was not being passed as an array, which is not my case. I have other fields RECORD mode REPEATED which work just fine following the same structure. It seems to be a bug on BigQuery.

The schema I used to create the table:

[
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "id",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "date",
        "type": "DATETIME"
      }
    ],
    "mode": "REPEATED",
    "name": "config_onboarding",
    "type": "RECORD"
  },
  {
    "mode": "NULLABLE",
    "name": "date_insert",
    "type": "DATETIME"
  },
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "key",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "value",
        "type": "STRING"
      }
    ],
    "mode": "REPEATED",
    "name": "features",
    "type": "RECORD"
  }
]

When I run the code to insert with only the config_onboarding field it works:

self._client.insert_rows(
    table=table,
    rows=[
        {
            'config_onboarding': [
                {'id': 'template-install', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'run-workflow', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'template-workflow-view', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'automate-workflow-wrapper', 'date': '2021-07-01 19:50:00.901000'}, 
                {'id': 'share-workflow', 'date': '2021-07-01 19:50:00.901000'}
            ],
            'date_insert': datetime.now()
        }
    ],
    selected_fields=self._schema,
    ignore_unknown_values=True,
    skip_invalid_rows=True
)

The result is an empty array meaning there were no errors. Also I can check the entry was added by querying the table.

But when I add the features field which follows the exact same structure, it fails returning an array with errors saying "repeated values added outside of an array":

self._client.insert_rows(
    table=table,
    rows=[
        {
            'config_onboarding': [
                {'id': 'template-install', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'run-workflow', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'template-workflow-view', 'date': '2021-07-01 18:43:44.810000'}, 
                {'id': 'automate-workflow-wrapper', 'date': '2021-07-01 19:50:00.901000'}, 
                {'id': 'share-workflow', 'date': '2021-07-01 19:50:00.901000'}
            ], 
            'features': [
                {'key': 'foo', 'value': 'bar'}, 
                {'key': 'hello', 'value': 'World'}
            ], 
            'date_insert': datetime.now()
        }
    ],
    selected_fields=self._schema,
    ignore_unknown_values=True,
    skip_invalid_rows=True
)

Do you guys have any clue?

UPDATE:

It turned out that It was missing the schema definition for the features field on the schema parameter. That was the issue.

Upvotes: 0

Views: 482

Answers (1)

Prajna Rai T
Prajna Rai T

Reputation: 1820

As per the table schema you have provided, I tried from my end.

enter image description here

You can try the below code to insert values into the table based on your schema.

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of table to append to.
table_id = "ProjectId.DatasetId.TableId"

rows_to_insert = [
    {"config_onboarding":[{"id":"template-install","date":"2021-07-01 18:43:44.810000"}],
      "date_insert":"2021-07-02 18:43:44.810000",
      "features":[{"key":"abcd","value":"hundred"}]},
]

errors = client.insert_rows_json(table_id, rows_to_insert)  # Make an API request.
if errors == []:
    print("New rows have been added.")
else:
    print("Encountered errors while inserting rows: {}".format(errors))

Output:

enter image description here

Upvotes: 2

Related Questions