Reputation: 560
I need to download the Flickr YFCC100M dataset. I have an Amazon AWS account but could not figure out how to download the dataset. There is a blog post about it, but it does not make the download steps clear to me.
With the Flickr API I can download images, but that would not be the YFCC100M dataset.
Here is one suggestion, but awscli could not be installed on my system.
>> sudo apt install awscli
>> ..........
>> Error: Unable to correct problems, you have held broken packages.
Is there an easy way to download this dataset?
Upvotes: 1
Views: 2717
Reputation: 5148
You need to register on the Yahoo Webscope website and add this dataset to the "Cart". After submitting your request for the dataset, you should receive an email with instructions. I am reproducing a part of this email, after scrubbing out some of the details and privileged information.
- Download and install s3cmd from http://s3tools.org/download (or using an appropriate package manager for your platform)
- Run 's3cmd --configure' and enter your access key and secret ( available via XXXXXXXX <-- the actual link will be in their email ). Here you can also specify additional options, such as enabling encryption during transfer, and enabling a proxy.
- Run 's3cmd ls s3://yahoo-webscope/XXXXXXX/' to view the S3 objects for I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS)
- Run 's3cmd get --recursive s3://yahoo-webscope/XXXXXXX/' to download a local copy of I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS)
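If you would rather script the last two steps, here is a minimal Python sketch that wraps the same `s3cmd` commands from the email. It assumes you have already run `s3cmd --configure` by hand (that step is interactive) and that `s3cmd` is on your PATH. The `XXXXXXX` path is the same placeholder as above, and `build_commands` and `run_commands` are names I made up for illustration, not part of s3cmd.

```python
import shutil
import subprocess

# Placeholder copied from the email above; the real path is in Yahoo's email.
S3_PREFIX = "s3://yahoo-webscope/XXXXXXX/"


def build_commands(s3_prefix=S3_PREFIX):
    """Return the list and download commands from the email as argv lists."""
    return [
        ["s3cmd", "ls", s3_prefix],
        ["s3cmd", "get", "--recursive", s3_prefix],
    ]


def run_commands(commands, dest_dir=".", dry_run=True):
    """Print each command; run it inside dest_dir only when dry_run is False."""
    if not dry_run and shutil.which("s3cmd") is None:
        raise RuntimeError("s3cmd not found on PATH; install it first")
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first command that fails
            subprocess.run(argv, cwd=dest_dir, check=True)


if __name__ == "__main__":
    # Dry run: only prints the two commands, nothing is downloaded
    run_commands(build_commands())
```

By default this only prints the commands. Calling `run_commands(build_commands(), dest_dir="some/dir", dry_run=False)` would actually run them from that directory, so check the printed commands and your free disk space first.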
It should be easy for you to follow these steps and get the dataset. I agree that the steps are not explained clearly on their website.
Upvotes: 0
Reputation: 122
This assumes that you already have pip and either Python 2.6.5+ or Python 3.3+ installed on your system. If you want to install awscli, you'll need to run
pip install awscli --upgrade --user
You can read more about installing the AWS Command Line Interface (CLI) here.
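To check the prerequisite before installing, something like the following rough sketch could help. It only tests the interpreter version against the 2.6.5+ / 3.3+ requirement stated above and builds the same pip command as an argument list. Both function names are hypothetical helpers, not part of pip or awscli.

```python
import sys


def python_supported(version_info=sys.version_info):
    """True for Python 2.6.5+ on the 2.x line or 3.3+ on the 3.x line."""
    version = tuple(version_info[:3])
    if version[0] == 2:
        return version >= (2, 6, 5)
    return version >= (3, 3, 0)


def pip_install_command(python=sys.executable):
    """Build the pip command from this answer, run through a given interpreter."""
    return [python, "-m", "pip", "install", "awscli", "--upgrade", "--user"]


if __name__ == "__main__":
    if python_supported():
        # Print the command instead of running it, so nothing is installed here
        print(" ".join(pip_install_command()))
    else:
        print("This Python version is below the stated requirement")
```

Going through `python -m pip` rather than a bare `pip` makes sure the package is installed for the interpreter you just checked.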
In addition, I think this link will give you access to the dataset you are looking for.
Upvotes: 1