Bobbyphtr

Reputation: 115

Unzip failed to finish in Google Colab

So I'm trying to train an autoencoder model, but I'm having difficulty extracting a large zip file and a RAR file in Google Drive: a 3 GB zip file containing 500 directories of images, and a 5 GB RAR file containing 1.7 million images.

I ran this code in Colab, and it finished extracting my 3 GB zip file after 6 hours:

!unzip -q "drive/My Drive/Colab Notebooks/Dataset/Dataset_Final_500/syn_train_3.zip" -d "drive/My Drive/Colab Notebooks/Dataset/Dataset_Final_500/"

But when I checked, only 86 out of 500 directories had been created in my Google Drive. Why does this happen, and how do I continue without re-extracting everything all over again? Any ideas on extracting my 5 GB RAR file to Google Drive?

Any help would be a blessing :)

Upvotes: 0

Views: 1508

Answers (1)

Bobbyphtr

Reputation: 115

As @BobSmith said, I first moved my whole dataset from Drive to Google Colab's local disk.
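For reference, that copy step might look something like this (a sketch; the mount point and archive path are taken from the question):

from google.colab import drive
drive.mount('/content/drive')

# copy the archive from Drive to the Colab VM's local disk
!cp "/content/drive/My Drive/Colab Notebooks/Dataset/Dataset_Final_500/syn_train_3.zip" /content/

I then extracted everything locally using: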

!unzip -u -q /content/syn_train_3.zip

and for the RAR file, using unrar:

!unrar e real_train_500_2.rar train_dir
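Note that unrar e extracts every file into the target directory without recreating the archive's folder structure; if the directory layout matters, the standard x command keeps the archived paths:

!unrar x real_train_500_2.rar train_dir/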

The extraction proved much faster. I then split the dataset into .npy files and saved those back to Drive.
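A minimal sketch of that packing step (the source/destination paths, image format, and chunk size are my assumptions, not the exact code used; it also assumes all images share the same dimensions so they can be stacked):

import numpy as np
from pathlib import Path
from PIL import Image

SRC = Path("/content/train_dir")                 # images extracted to local disk
DST = Path("/content/drive/My Drive/npy_chunks") # destination on Drive
DST.mkdir(parents=True, exist_ok=True)

paths = sorted(SRC.rglob("*.png"))
CHUNK = 10_000  # images per .npy file

for i in range(0, len(paths), CHUNK):
    # stack one chunk of images into a single (N, H, W, 3) array
    batch = np.stack([np.asarray(Image.open(p).convert("RGB")) for p in paths[i:i + CHUNK]])
    np.save(DST / f"train_{i // CHUNK:04d}.npy", batch)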

I found that Google Colab mounts Drive through Google Drive File Stream, much like Backup and Sync on your desktop, so it is painful to wait for a large dataset to sync between Colab and Drive.

Careful: don't let files showing up under "/drive/My Drive" in Google Colab fool you into thinking they are already saved to Google Drive. They need time to sync!
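If you need to be sure pending writes have actually reached Drive before the runtime disconnects, Colab can flush them explicitly (this also unmounts the drive, so remount afterwards if you still need it):

from google.colab import drive
drive.flush_and_unmount()  # blocks until all cached writes have been synced to Drive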

Upvotes: 1
