Reputation: 1
I'm still very new to the world of machine learning and am looking for some guidance for how to continue a project that I've been working on. Right now I'm trying to feed in the Food-101 dataset into the Image Classification algorithm in SageMaker and later deploy this trained model onto an AWS deeplens to have food detection capabilities. Unfortunately the dataset comes with only the raw image files organized in sub folders as well as a .h5 file (not sure if I can just directly feed this file type into sageMaker?). From what I've gathered neither of these are suitable ways to feed in this dataset into SageMaker and I was wondering if anyone could help point me in the right direction of how I might be able to prepare the dataset properly for SageMaker i.e convert to a .rec or something else. Apologies if the scope of this question is very broad I am still a beginner to all of this and I'm simply stuck and do not know how to proceed so any help you guys might be able to provide would be fantastic. Thanks!
Upvotes: 0
Views: 337
Reputation: 2729
if you want to use the built-in algo for image classification, you can either use Image format or RecordIO format, re: https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html#IC-inputoutput
Image format is straightforward: just build a manifest file with the list of images. This could be an easy solution for you, since you already have images organized in folders.
RecordIO requires that you build files with the 'im2rec' tool, re: https://mxnet.incubator.apache.org/versions/master/faq/recordio.html.
Once your data set is ready, you should be able to adapt the sample notebooks available at https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms
Upvotes: 1