Reputation: 679
I'm looking for a way to fix the slow loading of an image dataset in Google Colab when I read it from a mounted Google Drive. I mount the drive with the following code:
from google.colab import drive
drive.mount('/content/gdrive')
With this setup I can load the images and create the labels using my own load_dataset function:
train_path = '/content/gdrive/MyDrive/Capstone/Enviroment/cell_images/train'
train_files, train_targets = load_dataset(train_path)
But, as I said, it's very slow, especially because my full dataset consists of 27560 images.
To solve my problem, I tried this solution.
But now, so that I can still use my load_dataset function, I want to extract the downloaded .tar file into a specific folder in the Colab environment. I found this answer, but it did not solve my problem.
Example:
This is the environment with the test.tar already downloaded.
But I want to extract the files in the tar file, whose structure is train/Uninfected; train/Parasitized, to get this:
content
so that I can use these paths with my load_dataset function:
train_path = 'content/cell_images/train/'
train_files, train_targets = load_dataset(train_path)
test_path = 'content/cell_images/test/'
test_files, test_targets = load_dataset(test_path)
valid_path = 'content/cell_images/valid/'
valid_files, valid_targets = load_dataset(valid_path)
I tried to use:
! mkdir -p content/cell_images
and
!tar -xvf 'test.tar' content/cell_images
But it doesn't work.
Does anyone know how to proceed?
Thanks!
Upvotes: 16
Views: 35551
Reputation: 1638
!tar -xvf "cord-19_2021-12-20.tar.gz"
as also shown here: https://colab.research.google.com/github/sudo-ken/compress-decompress-in-Google-Drive/blob/master/Unrar_Unzip_Rar_Zip_in_GDrive.ipynb
Upvotes: 1
Reputation: 335
A late answer, but it might help others:
shutil.unpack_archive works with the common archive formats (e.g., "zip", "tar", "gztar", "bztar", "xztar") and it's simple:
import shutil
shutil.unpack_archive("filename", "path_to_extract")
Upvotes: 15
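A minimal sketch of how the call above could be applied to the layout from the question. The helper name extract_archive and the file cell_0.png are hypothetical and only used for illustration; the demo builds a throwaway test.tar in a temporary directory so it can be run anywhere, whereas in Colab you would pass your real archive and target folder instead.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack an archive into dest_dir, creating the folder if needed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    # The format is inferred from the file extension (.tar, .zip, .tar.gz, ...).
    shutil.unpack_archive(str(archive_path), str(dest))
    return dest


# Self-contained demo on a throwaway archive (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    src = tmp / "src" / "train" / "Uninfected"
    src.mkdir(parents=True)
    (src / "cell_0.png").write_bytes(b"placeholder")

    # Build a small test.tar whose top-level folder is train/.
    archive = tmp / "test.tar"
    with tarfile.open(archive, "w") as tar:
        tar.add(tmp / "src" / "train", arcname="train")

    out = extract_archive(archive, tmp / "content" / "cell_images")
    extracted = sorted(p.relative_to(out).as_posix() for p in out.rglob("*"))
    print(extracted)
```

With the question's files, the equivalent call would presumably be extract_archive("test.tar", "content/cell_images"), after which train_path can point at content/cell_images/train/.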
Reputation: 145
If your current directory is the default directory, /content, you can extract your project folder like this:
%%bash
mkdir foldername
tar -xvf '/content/foldername.tar' -C '/content/'
%%bash lets you write a script without putting ! at the beginning of each line.
Upvotes: 0
Reputation: 534
Connect to Drive:
from google.colab import drive
drive.mount('/content/drive')
Check the current directory and its contents:
!ls
!pwd
Unzip the archive:
!unzip drive/"My Drive"/images.zip -d destination
Upvotes: 0
Reputation: 29307
To extract the files from the tar archive into the folder content/cell_images (which must already exist), use the command-line option -C:
!tar -xvf 'test.tar' -C 'content/cell_images'
Hope this helps!
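If you prefer to stay in Python, the same idea can be sketched with the standard tarfile module. This is only an illustration, not part of the original answer: the helper names top_level_entries and extract_tar and the placeholder files a.png and b.png are hypothetical. Checking the top-level entries first tells you whether the archive already contains a train/ folder, so you know which path to pass to load_dataset afterwards.

```python
import io
import tarfile
import tempfile
from pathlib import Path, PurePosixPath


def top_level_entries(archive_path):
    """Return the distinct first path components stored in the archive."""
    with tarfile.open(archive_path) as tar:
        names = tar.getnames()
    tops = set()
    for name in names:
        parts = PurePosixPath(name).parts
        if parts:
            tops.add(parts[0])
    return sorted(tops)


def extract_tar(archive_path, dest_dir):
    """Rough equivalent of `mkdir -p dest_dir && tar -xf archive -C dest_dir`."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        tar.extractall(path=dest)
    return dest


# Self-contained demo: build a tiny archive with the structure from the question.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "test.tar"
    with tarfile.open(archive, "w") as tar:
        for name in ("train/Uninfected/a.png", "train/Parasitized/b.png"):
            data = b"placeholder"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    tops = top_level_entries(archive)
    dest = extract_tar(archive, Path(tmp) / "content" / "cell_images")
    classes = sorted(p.name for p in (dest / "train").iterdir())
    print(tops, classes)
```

Only extract archives you trust this way, since tar members can carry unexpected paths.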
Upvotes: 21