Navid Jahromi

Reputation: 1

How to convert food-101 dataset into usable format for AWS SageMaker

I'm still very new to machine learning and am looking for guidance on how to continue a project I've been working on. Right now I'm trying to feed the Food-101 dataset into the Image Classification algorithm in SageMaker, and later deploy the trained model to an AWS DeepLens for food detection. Unfortunately, the dataset comes with only the raw image files organized in subfolders, plus a .h5 file (I'm not sure whether I can feed this file type directly into SageMaker). From what I've gathered, neither of these is a suitable way to feed the dataset into SageMaker. Could anyone point me in the right direction on how to prepare the dataset properly for SageMaker, i.e. convert it to a .rec file or something else? Apologies if the scope of this question is very broad. I'm still a beginner, I'm stuck, and I don't know how to proceed, so any help would be fantastic. Thanks!

Upvotes: 0

Views: 337

Answers (1)

Julien Simon

Reputation: 2729

If you want to use the built-in image classification algorithm, you can supply the data in either Image format or RecordIO format; see https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html#IC-inputoutput

Image format is straightforward: just build a list file (.lst) that enumerates the images along with their class labels. This could be an easy solution for you, since you already have images organized in folders.
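As a rough illustration of the Image format route, here is a minimal Python sketch that walks a folder tree and produces the lines of a list file. It assumes one subfolder per class directly under the image root, and a tab-separated layout of image index, numeric class label, and relative path, which is how the SageMaker documentation describes .lst files. The function names (`build_lst_lines`, `write_lst`) and the example paths are made up for this sketch, and it does not do a train/validation split.

```python
from pathlib import Path


def build_lst_lines(image_root, extensions=(".jpg", ".jpeg", ".png")):
    """Return tab-separated lines: image index, class label, relative path.

    Assumes image_root contains one subfolder per class (hypothetical layout,
    e.g. images/apple_pie/123.jpg). Labels are assigned by sorted folder name.
    """
    root = Path(image_root)
    class_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    lines = []
    index = 0
    for label, class_dir in enumerate(class_dirs):
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that does not look like an image file.
            if image_path.suffix.lower() not in extensions:
                continue
            relative = image_path.relative_to(root).as_posix()
            lines.append(f"{index}\t{label}\t{relative}")
            index += 1
    return lines


def write_lst(image_root, lst_path):
    """Write the list file and return the number of images listed."""
    lines = build_lst_lines(image_root)
    Path(lst_path).write_text("\n".join(lines) + "\n")
    return len(lines)


# Example usage (paths are placeholders):
# write_lst("food-101/images", "food101_train.lst")
```

You would typically shuffle the lines and split them into separate training and validation list files before uploading the images and the lists to S3.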

RecordIO requires that you build the .rec files with the 'im2rec' tool (see https://mxnet.incubator.apache.org/versions/master/faq/recordio.html).
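For the RecordIO route, im2rec is usually run in two passes: one to generate list files from the image folders, and one to pack the listed images into .rec files. The sketch below only assembles the two command lines in Python. The flags shown (`--list`, `--recursive`, `--train-ratio`, `--resize`, `--quality`, `--num-thread`) are the ones I remember from MXNet's `tools/im2rec.py`, but their exact syntax has changed between MXNet versions, so treat them as assumptions and check `python im2rec.py --help` first. The helper names, script location, prefix, and folder are placeholders.

```python
import subprocess
import sys


def im2rec_commands(prefix, image_root, im2rec_script="im2rec.py", train_ratio=0.8):
    """Build the two im2rec invocations (list pass, then pack pass).

    prefix, image_root and im2rec_script are placeholders; flag names are
    assumed from MXNet's tools/im2rec.py and may differ in your version.
    """
    base = [sys.executable, im2rec_script]
    # Pass 1: scan class subfolders and write .lst files with a train/val split.
    list_cmd = base + [
        "--list", "--recursive",
        "--train-ratio", str(train_ratio),
        prefix, image_root,
    ]
    # Pass 2: pack the images named in the .lst files into .rec files.
    pack_cmd = base + [
        "--resize", "256",
        "--quality", "90",
        "--num-thread", "4",
        prefix, image_root,
    ]
    return list_cmd, pack_cmd


def run_im2rec(prefix, image_root, **kwargs):
    """Run both passes in order, stopping if either one fails."""
    for cmd in im2rec_commands(prefix, image_root, **kwargs):
        subprocess.run(cmd, check=True)


# Example usage (placeholders):
# run_im2rec("food101", "food-101/images")
```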

Once your data set is ready, you should be able to adapt the sample notebooks available at https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms

Upvotes: 1
