Vikram Tiwari
Vikram Tiwari

Reputation: 3895

How to add "array of strings" as a schema value for BigQuery

I want to store an array of string values in a single column in BigQuery. In order to do that I need to first create a table with the proper schema. The schema identifier for the array is ARRAY<T> where T is the data type.

For example: T replaced by INT64 as ARRAY<INT64> enables the storage of 64-bit integer array in that column. How should I get this same effect but for storing string values? I have already tried STRING, VARCHAR and CHAR.

Just mention: I am using latest google-cloud python package

Documentation Reference for BigQuery Schema

Upvotes: 11

Views: 19544

Answers (2)

Vikram Tiwari
Vikram Tiwari

Reputation: 3895

In order to add a column with array data, you need to define that column's mode as REPEATED. Hence a sample schema ends up being:

{
  'name': 'array_of_strings',
  'type': 'STRING',
  'mode': 'REPEATED'
},{
  'name': 'array_of_floats',
  'type': 'FLOAT',
  'mode': 'REPEATED'
}

This makes the field hold array values.

NOTE: You should be aware that if this is the schema of the table then you can not use CSV import functionality of big query since this is a limitation of CSV file format. You will need to use either json or avro formats.

Reference to the GitHub issue

Upvotes: 27

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

In order to do that I need to first create a table with the proper schema.

just run below in Web UI with new destination table - to create needed schema

#standardSQL
SELECT ARRAY<STRING>[] AS array_of_strings

Upvotes: 4

Related Questions