Roger

Reputation: 437

Load thousands of JSON files into BigQuery

I have around 10,000 JSON files, and I want to load them into BigQuery. As BigQuery only accepts newline-delimited JSON (NDJSON), I spent hours searching for a solution, but I can't find an easy and clean way to convert all the files to NDJSON.

I tested cat test.json | jq -c '.[]' > testNDJSON.json and it works well for converting a single file, but how do I convert all of the files at once?

Right now, my ~10k files are in a GCS bucket and weigh ~5 GB in total.
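
The closest thing I have is a plain shell loop like the one below (bucket, dataset, and table names are placeholders), but with ~10k files it feels slow and clunky, so I'm hoping there is a cleaner way:

# 1. Copy the JSON files from the bucket to a local working directory.
mkdir -p json_input ndjson_output
gsutil -m cp "gs://my-bucket/raw/*.json" json_input/

# 2. Convert each file to newline-delimited JSON with jq.
for f in json_input/*.json; do
  jq -c '.[]' "$f" > "ndjson_output/$(basename "$f")"
done

# 3. Copy the converted files back to the bucket and load them with a wildcard URI.
gsutil -m cp ndjson_output/*.json gs://my-bucket/ndjson/
bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON \
  my_dataset.my_table "gs://my-bucket/ndjson/*.json"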

Thanks!

Upvotes: 0

Views: 940

Answers (2)

Parth Mehta

Reputation: 1927

Did you come across Dataprep in your search? Dataprep can read data from Cloud Storage, help you format it, and insert it into BigQuery for you.

Alternatively, you can use a Cloud Dataflow I/O transform to handle this automatically. See the link below for reference.

Hope this helps.

Upvotes: 1

Enrique Zetina

Reputation: 835

My suggestion is to use a Google-provided Cloud Dataflow template to transfer your files to BigQuery. You can use the one called Cloud Storage Text to BigQuery; note that you will need to provide a UDF to transform your JSON files.
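
For illustration, launching the template with gcloud looks roughly like the sketch below. All bucket, schema, and table names are placeholders, and the parameter names should be double-checked against the template documentation; the UDF is a JavaScript function that you store in Cloud Storage and reference by name:

# Run the Google-provided "Cloud Storage Text to BigQuery" template.
gcloud dataflow jobs run json-to-bq \
  --gcs-location=gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
  --region=us-central1 \
  --parameters=\
inputFilePattern="gs://my-bucket/raw/*.json",\
JSONPath=gs://my-bucket/config/bq_schema.json,\
javascriptTextTransformGcsPath=gs://my-bucket/config/transform.js,\
javascriptTextTransformFunctionName=transform,\
outputTable=my-project:my_dataset.my_table,\
bigQueryLoadingTemporaryDirectory=gs://my-bucket/tmp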

Upvotes: 0
