Loading JSON data to multiple tables in BigQuery

Question

My JSON looks like:

{
    "Key1":"Value1","Key2":"Value2","Key3":"Value3","List1":
    [
        {
            "SubKey1":"SubValue1_1","SubKey2":"SubValue1_2","SubKey3":"SubValue1_3"
        },
        {
            "SubKey1":"SubValue2_1","SubKey2":"SubValue2_2","SubKey3":"SubValue2_3"
        },
        {
            "SubKey1":"SubValue3_1","SubKey2":"SubValue3_2","SubKey3":"SubValue3_3"
        }
    ]
}

It loads in a single BigQuery table like this:

But I want my data to load into 2 separate tables like:

and

Please advise on what I should do.

C.Georgiadis · Accepted Answer

This is possible if you can use the bq command-line tool.

Assuming that your JSON file (my_json_file.json) lives in a GCS bucket (e.g. my_gcs_bucket) and the destination table is my_dataset.my_destination_table, you can run the following command:

bq load --ignore_unknown_values --source_format=NEWLINE_DELIMITED_JSON my_dataset.my_destination_table "gs://my_gcs_bucket/my_json_file.json" ./schema.json

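where schema.json defines the schema of the destination table. Because of --ignore_unknown_values, any JSON fields that are not listed in the schema are dropped instead of causing the load to fail. For instance, the following two schemas will load the data as expected: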

schema_1.json

[
  {
    "mode": "NULLABLE",
    "name": "Key1",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "Key2",
    "type": "STRING"
  },
  {
    "mode": "NULLABLE",
    "name": "Key3",
    "type": "STRING"
  }
]

and schema_2.json

[
  {
    "mode": "NULLABLE",
    "name": "Key1",
    "type": "STRING"
  },
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "SubKey1",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "SubKey2",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "SubKey3",
        "type": "STRING"
      }
    ],
    "mode": "REPEATED",
    "name": "List1",
    "type": "RECORD"
  }
]

Then running

bq load --ignore_unknown_values --source_format=NEWLINE_DELIMITED_JSON my_dataset.my_destination_table_1 "gs://my_gcs_bucket/my_json_file.json" ./schema_1.json

bq load --ignore_unknown_values --source_format=NEWLINE_DELIMITED_JSON my_dataset.my_destination_table_2 "gs://my_gcs_bucket/my_json_file.json" ./schema_2.json

will load two different tables from the same JSON file: my_destination_table_1 gets only Key1, Key2 and Key3, while my_destination_table_2 gets Key1 and the repeated List1 record.
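Two details may be worth checking locally before running the loads. First, NEWLINE_DELIMITED_JSON expects each record on a single line, so a pretty-printed file like the one in the question would likely need to be collapsed first. Second, it can help to preview which fields each schema keeps. The Python sketch below does both. The helper names `to_ndjson` and `project` are hypothetical (they are not part of bq or any BigQuery library), the schemas are abbreviated copies of the two above, and `project` is only a simplified local approximation of what --ignore_unknown_values does, not BigQuery's actual loader logic.

```python
import json

# Abbreviated copies of schema_1.json and schema_2.json ("mode" omitted where NULLABLE).
SCHEMA_1 = [{"name": k, "type": "STRING"} for k in ("Key1", "Key2", "Key3")]
SCHEMA_2 = [
    {"name": "Key1", "type": "STRING"},
    {
        "name": "List1",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [
            {"name": k, "type": "STRING"} for k in ("SubKey1", "SubKey2", "SubKey3")
        ],
    },
]


def to_ndjson(text: str) -> str:
    """Collapse a JSON object (or array of objects) to one compact object per line."""
    data = json.loads(text)
    records = data if isinstance(data, list) else [data]
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


def project(record: dict, schema: list) -> dict:
    """Keep only the fields named in the schema (hypothetical local preview helper)."""
    kept = {}
    for field in schema:
        name = field["name"]
        if name not in record:
            continue  # field absent from this record, nothing to keep
        value = record[name]
        if field["type"] == "RECORD":
            sub_schema = field.get("fields", [])
            if field.get("mode") == "REPEATED":
                # Repeated record: filter every element of the list.
                value = [project(item, sub_schema) for item in value]
            else:
                value = project(value, sub_schema)
        kept[name] = value
    return kept


if __name__ == "__main__":
    pretty = """{
      "Key1": "Value1", "Key2": "Value2", "Key3": "Value3",
      "List1": [
        {"SubKey1": "SubValue1_1", "SubKey2": "SubValue1_2", "SubKey3": "SubValue1_3"}
      ]
    }"""
    for line in to_ndjson(pretty).splitlines():
        record = json.loads(line)
        print("table_1:", json.dumps(project(record, SCHEMA_1)))
        print("table_2:", json.dumps(project(record, SCHEMA_2)))
```

The output of `to_ndjson` is what would be uploaded to the bucket in place of the pretty-printed file. Note that the second table still holds List1 as a nested, repeated column; if one row per list element is wanted, that would have to be flattened afterwards, for example with a query over the loaded table.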
