Mohbat Tharani

Reputation: 560

Download YFCC-100M dataset

I need to download the Flickr YFCC-100M dataset. I have an Amazon AWS account but could not figure out how to download the dataset. There is a blog post, but its download instructions are not clear to me.

With the Flickr API I can download images, but that would not be the YFCC100M dataset.

Here is one suggestion, but awscli could not be installed on my system.

>> sudo apt install awscli 
>> .......... 
>> Error: Unable to correct problems, you have held broken packages.

Is there an easy way to download this dataset?

Upvotes: 1

Views: 2717

Answers (2)

AruniRC

Reputation: 5148

You need to register on the Yahoo Webscope website and add this dataset to the "Cart". After submitting your request for the dataset, you should receive an email with instructions. I am reproducing a part of this email, after scrubbing out some of the details and privileged information.

  1. Download and install s3cmd from http://s3tools.org/download (or using an appropriate package manager for your platform)
  2. Run 's3cmd --configure' and enter your access key and secret ( available via XXXXXXXX <-- the actual link will be in their email ). Here you can also specify additional options, such as enabling encryption during transfer, and enabling a proxy.
  3. Run 's3cmd ls s3://yahoo-webscope/XXXXXXX/' to view the S3 objects for I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS)
  4. Run 's3cmd get --recursive s3://yahoo-webscope/XXXXXXX/' to download a local copy of I3 - Yahoo Flickr Creative Commons 100M (14G) (Hosted on AWS)
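Steps 3 and 4 above can be sketched as a small script. This is only an illustration, not part of the Webscope email: the `run` helper and the `DATASET_URI`, `DEST_DIR` and `DRY_RUN` variables are hypothetical names introduced here, and the `XXXXXXX` prefix is left elided exactly as in the email, so it has to be replaced with the value you receive. By default the script only prints the commands; it assumes `s3cmd --configure` (step 2) has already been run by hand.

```shell
#!/bin/sh
# Sketch of steps 3 and 4. The dataset prefix is elided ("XXXXXXX") in the
# email quoted above; set DATASET_URI to the real value from your own email.
DATASET_URI="${DATASET_URI:-s3://yahoo-webscope/XXXXXXX/}"
DEST_DIR="${DEST_DIR:-./yfcc100m}"   # hypothetical local target directory
DRY_RUN="${DRY_RUN:-1}"              # 1 = print commands only, 0 = execute

# Print a command, and execute it only when DRY_RUN is 0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 3: list the S3 objects that belong to the dataset.
run s3cmd ls "$DATASET_URI"

# Step 4: download everything under the prefix into DEST_DIR.
run mkdir -p "$DEST_DIR"
run s3cmd get --recursive "$DATASET_URI" "$DEST_DIR/"
```

Running it once with the default `DRY_RUN=1` lets you check the commands before starting the actual transfer with `DRY_RUN=0`.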

It should be easy to follow these steps and get the dataset. I agree, the steps are not very transparent on their website!

Upvotes: 0

Imma

Reputation: 122

If apt cannot install awscli, you can install it with pip instead. This assumes that you already have pip and either Python 2.6.5+ or Python 3.3+ installed on your system. Run:

pip install awscli --upgrade --user

You can read more about installing the AWS Command Line Interface (CLI) here.
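After a `--user` install the `aws` executable may not be on your PATH yet. The following is a minimal sketch for checking that; it assumes a Linux system, where pip places user-level scripts in `$HOME/.local/bin`. The `path_contains` helper and the `USER_BIN` variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# Succeeds if the directory given as $1 is listed in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: Linux default location for scripts installed with pip --user.
USER_BIN="$HOME/.local/bin"

if command -v aws >/dev/null 2>&1; then
    # The CLI is reachable; print its version as a sanity check.
    aws --version
elif path_contains "$USER_BIN"; then
    echo "aws not found although $USER_BIN is on PATH; re-check the pip output"
else
    echo "aws not found; consider adding $USER_BIN to PATH:"
    echo "  export PATH=\"$USER_BIN:\$PATH\""
fi
```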

In addition, I think this link would give you access to the dataset that you are looking for.

Upvotes: 1
