Anubis05
Anubis05

Reputation: 1274

JSON formatting Error when loading into Google Big Query

I am trying to load the following data in Big Query from PUBSUB using the built in dataflow template:

{
    "current_speed": "19.09",
    "_east": "-87.654561",
    "_last_updt": "2018-07-17 15:50:54.0",
    "_region_id": "1",
    "_north": "42.026444",
    "_south": "41.997946",
    "region": "Rogers Park - West Ridge",
    "_west": "-87.709645",
    "_description": "North of Devon. Kedzie to Lake Shore"
}

But I keeping getting this error:

"Error while reading data, error message: Failed to parse JSON: Unexpected end of string; Unexpected end of string; Expected key"

I actually need to load the larger dataset which looks like this:

 [{
    "current_speed": "19.09",
    "_east": "-87.654561",
    "_last_updt": "2018-07-17 15:50:54.0",
    "_region_id": "1",
    "_north": "42.026444",
    "_south": "41.997946",
    "region": "Rogers Park - West Ridge",
    "_west": "-87.709645",
    "_description": "North of Devon. Kedzie to Lake Shore"
}, {
    "current_speed": "25.23",
    "_east": "-87.747456",
    "_last_updt": "2018-07-17 15:50:54.0",
    "_region_id": "2",
    "_north": "42.0190998",
    "_south": "41.960669",
    "region": "Far North West",
    "_west": "-87.84621",
    "_description": "North of Montrose. East River to Cicero"
}

]

But there I get this error:

Error while reading data, error message: Failed to parse JSON: No object found when new array is started.; BeginArray returned false; Parser terminated before end of string

What am I doing wrong here?

Upvotes: 18

Views: 44809

Answers (2)

Felipe Hoffa
Felipe Hoffa

Reputation: 59325

To convert JSON to new line delimited JSON (which is the format that BigQuery ingests) you can use jq:

$ cat a.json 
[{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
}
]


$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}

(see https://stackoverflow.com/a/51301075/132438)

Upvotes: 32

Hua Zhang
Hua Zhang

Reputation: 1551

Yes, BigQuery only accepts new-line delimited JSON, which means one complete JSON object per line. Before you merge the object to one line, BigQuery reads "{", which is start of an object, and expects to read a key, but the line ended, so you see the error message "expected key".

For multiple JSON objects, just put them one in each line. Don't enclose them inside an array. BigQuery expects each line to start with an object, "{". If you put "[" as the first character, you will see the second error message which means BigQuery reads an array but not inside an object.

Upvotes: 28

Related Questions