Reputation: 69
How can I import or download the RVL-CDIP dataset quickly?
I have searched extensively for a link that would let me import the dataset directly into my Jupyter notebook, but I have not found one.
Upvotes: 1
Views: 1026
Reputation: 91
from datasets import load_dataset
dataset = load_dataset("aharley/rvl_cdip")
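If the full download is too slow, you could try streaming mode, which fetches examples on demand instead of downloading everything first. This is only a sketch: the function names open_rvl_cdip_stream and take are my own (hypothetical), and whether streaming works for this repository depends on your installed datasets version and how the dataset is packaged on the Hub.

```python
from itertools import islice


def open_rvl_cdip_stream(split="train"):
    """Open one split of RVL-CDIP in streaming mode (assumed to be supported)."""
    # Imported lazily so the helper below stays usable without the library.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches examples on demand.
    return load_dataset("aharley/rvl_cdip", split=split, streaming=True)


def take(examples, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(examples, n))
```

In a notebook cell, take(open_rvl_cdip_stream(), 3) should then give you the first three examples to inspect before you commit to a full download.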
Upvotes: 1
Reputation: 11
You can try loading the RVL-CDIP dataset through TensorFlow Datasets (TFDS), which can read datasets from the Hugging Face Datasets Hub via the huggingface: prefix.
import tensorflow_datasets as tfds
# Load the RVL-CDIP dataset
ds = tfds.load('huggingface:rvl_cdip', split='train', shuffle_files=True)
Once loaded, the ds object can be used for further processing and for training machine learning models.
Upvotes: 0
Reputation: 41
This downloads the dataset from Google Drive and saves it as a file named 'rvl-cdip' in your notebook folder:
!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=0Bz1dfcnrpXM-MUt4cHNzUEFXcmc' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=0Bz1dfcnrpXM-MUt4cHNzUEFXcmc" -O rvl-cdip && rm -rf /tmp/cookies.txt
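After the download finishes you still need to unpack the file. Here is a minimal sketch, assuming the downloaded 'rvl-cdip' file is a tar archive (possibly gzip-compressed); I have not confirmed that above, and the helper names list_members and extract_all are hypothetical. Listing a few entries first is a cheap sanity check, because tarfile raises ReadError if the file turns out not to be an archive at all.

```python
import tarfile
from itertools import islice


def list_members(archive_path, limit=10):
    """Return the names of the first `limit` entries without extracting."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(archive_path, "r:*") as tar:
        return [member.name for member in islice(tar, limit)]


def extract_all(archive_path, dest_dir):
    """Unpack the whole archive into dest_dir."""
    # Use the safer "data" extraction filter where this Python supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(dest_dir, **kwargs)
```

For example, run list_members("rvl-cdip") first, and only call extract_all("rvl-cdip", "data") once the listing looks right, since the full extraction needs a lot of disk space.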
Upvotes: 3