chemmy

Reputation: 25

Edit Multiple Files in GCP Storage Bucket

I have multiple JSON files in a GCS bucket that I need to export to BigQuery.

They are not newline-delimited, so I need to edit the files first. I'm looking to use Cloud Shell to do this at scale, since data dumps like this will happen often.

I was thinking it should be something along the lines of

gsutil cat gs://triad_data/file_testing/Appointment.json | jq -c '.[]' > apptNDJSON.json

but I have no clue how to pipe this for all items in my storage bucket. Is this the correct line of thought or is an operation like this not possible in GCP?

Upvotes: 1

Views: 1833

Answers (1)

Louis C

Reputation: 655

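Cloud Storage is not a file system. Objects are immutable: you can only write, read and delete them. There is no in-place update, and a "move" is really a copy followed by a delete. You can enable object versioning so that an overwrite keeps the previous version, but you can't edit an existing object on GCS directly.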

With gsutil specifically, the command "gsutil cat" writes the contents of one or more objects to stdout, in other words it prints the file to the console. It does not change anything in the bucket.

A command that is more similar to what you're looking for would be "gsutil compose", but it concatenates the content of File_A and File_B into a new File_C, which is not what you need here.

You will need to write code that downloads each file, edits the content and then uploads the result again, or something similar.
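As a rough sketch of that download, edit, upload loop in Cloud Shell, something like the following might work. It assumes every source object is a single top-level JSON array (as the jq filter in the question implies), that jq and gsutil are available, and it streams through "gsutil cp -" instead of using a local temp file. The function names and the destination prefix in the usage note are made up for illustration.

```shell
# Convert a top-level JSON array on stdin into newline-delimited JSON on stdout.
# Same jq filter as in the question.
to_ndjson() {
  jq -c '.[]'
}

# Hypothetical helper: for every .json object under a source prefix, write an
# NDJSON copy under a destination prefix. The originals are left untouched,
# since objects cannot be edited in place.
convert_prefix() {
  cp_src="$1"   # e.g. gs://triad_data/file_testing
  cp_dst="$2"   # assumption: any prefix you choose for the converted files
  gsutil ls "${cp_src}/*.json" | while read -r cp_obj; do
    cp_name="$(basename "$cp_obj" .json)"
    # Stream: read the object, convert it, upload the result from stdin.
    gsutil cat "$cp_obj" </dev/null \
      | to_ndjson \
      | gsutil cp - "${cp_dst}/${cp_name}.ndjson"
  done
}
```

It could then be called as: convert_prefix gs://triad_data/file_testing gs://triad_data/file_testing_ndjson (the second path is only an example). Writing to a separate prefix keeps the original dumps around in case a conversion fails halfway.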

Upvotes: 1
