Reputation: 1351
As the question states, I'm trying to figure out how I can extract a .tar.gz
file that is stored in a GCS Bucket from a Google Colab notebook.
I am able to connect to my bucket via:
from google.colab import auth  # needed for authenticate_user()
auth.authenticate_user()
project_id = 'my-project'
!gcloud config set project {project_id}
However, when I try running a command such as:
!gsutil tar xvzf my-bucket/compressed-files.tar.gz
I get an error. I know gsutil probably has limited functionality and may not be meant for this, so is there a different way to do it?
Thanks!
Upvotes: 1
Views: 10749
Reputation: 327
You can create a Dataflow job from a template to decompress a file in your bucket. The template is called Bulk Decompress Cloud Storage Files.
You have to specify the input file location, the output location, a failure log file, and a temp location.
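A sketch of launching that template with gcloud from a terminal (the job name, region, and output paths are assumptions; the input file is the one from the question):

gcloud dataflow jobs run decompress-archive \
    --gcs-location gs://dataflow-templates/latest/Bulk_Decompress_GCS_Files \
    --region us-central1 \
    --parameters inputFilePattern=gs://my-bucket/compressed-files.tar.gz,outputDirectory=gs://my-bucket/decompressed,outputFailureFile=gs://my-bucket/decompressed/failed.csv

Note that, as far as I know, this template only strips the compression layer (gzip, bzip2, deflate), so for a .tar.gz you would still end up with a .tar archive that needs unpacking separately.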
Upvotes: 2
Reputation: 323
This worked for me. I'm new to Colab and Python itself, so I'm not certain this is the solution.
!sudo tar -xvf my-bucket/compressed-files.tar.gz
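Note that tar only sees the Colab VM's local filesystem, so this assumes my-bucket/compressed-files.tar.gz is a local path (e.g. a mounted directory). If the archive only exists in the bucket, a sketch of copying it down first (bucket and file names taken from the question):

!gsutil cp gs://my-bucket/compressed-files.tar.gz .
!tar -xvzf compressed-files.tar.gz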
Upvotes: 0
Reputation: 806
Google Cloud Storage (GCS) does not natively support unpacking a tar archive. You will have to do this yourself, for instance on your local machine or from a Compute Engine VM.
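For example, a minimal sketch of doing this yourself in Python with the google-cloud-storage client and the standard library's tarfile module (the project, bucket, and object names are taken from the question; the local paths are assumptions):

import tarfile
from google.cloud import storage

# Download the archive from the bucket to local disk, then unpack it there.
client = storage.Client(project='my-project')
blob = client.bucket('my-bucket').blob('compressed-files.tar.gz')
blob.download_to_filename('/tmp/compressed-files.tar.gz')

with tarfile.open('/tmp/compressed-files.tar.gz', 'r:gz') as tar:
    tar.extractall('/tmp/extracted')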
Upvotes: 6