Reputation: 4419
I have the following code that gets a set of images, around 50 in each training set, and then creates a linear model and tries to classify the data. I have a testing set also but it can't even classify the training data with any kind of accuracy. Is there some error in the way that I'm loading the images in? I'd be glad to provide more code or my output if it would be helpful.
def create_image_list(file_path):
    image_list = []
    for filename in glob.glob(file_path):
        img = Image.open(filename)
        img_resized = img.resize((32, 32), Image.ANTIALIAS)
        pix = img.load()
        pixlist = []
        for x in range(0, 32):
            for y in range(0, 32):
                pixlist.append(pix[x, y][0])
                pixlist.append(pix[x, y][1])
                pixlist.append(pix[x, y][2])
        image_list.append(pixlist)
    return image_list
dalmation_training = create_image_list('/images/dalmatian/training/*')
dollabill_training = create_image_list('/images/dollar_bill/training/*')
pizza_training = create_image_list('/images/pizza/training/*')
soccer_ball_training = create_image_list('/images/soccer_ball/training/*')
sunflower_training = create_image_list('/images/sunflower/training/*')
c = '1e2'
testing_set = dalmation_training + dollabill_training + pizza_training + soccer_ball_training + sunflower_training
dalmation_y = [1]*len(dalmation_training ) + [-1]*len(dollabill_training) + [-1]*len(pizza_training) + [-1]*len(soccer_ball_training) + [-1]*len(sunflower_training)
dalmation_model_linear = svm_train(dalmation_y, testing_set, '-t 0 -c %s -b 1 -q' % c)
dollabill_y = [-1]*len(dalmation_training ) + [1]*len(dollabill_training) + [-1]*len(pizza_training) + [-1]*len(soccer_ball_training) + [-1]*len(sunflower_training)
dollabill_model_linear = svm_train(dollabill_y, testing_set, "-t 0 -c %s -b 1 -q" % c)
pizza_y = [-1]*len(dalmation_training ) + [-1]*len(dollabill_training) + [1]*len(pizza_training) + [-1]*len(soccer_ball_training) + [-1]*len(sunflower_training)
pizza_model_linear = svm_train(pizza_y, testing_set, "-t 0 -c %s -b 1 -q" % c)
soccer_ball_y = [-1]*len(dalmation_training ) + [-1]*len(dollabill_training) + [-1]*len(pizza_training) + [1]*len(soccer_ball_training) + [-1]*len(sunflower_training)
soccer_ball_model_linear = svm_train(soccer_ball_y, testing_set, "-t 0 -c %s -b 1 -q" % c)
sunflower_y = [-1]*len(dalmation_training) + [-1]*len(dollabill_training) + [-1]*len(pizza_training) + [-1]*len(soccer_ball_training) + [1]*len(sunflower_training)
sunflower_model_linear = svm_train(sunflower_y, testing_set, "-t 0 -c %s -b 1 -q" % c)
print 'dalmation linear'
result1, something, p1 = svm_predict([1]*len(testing_set), testing_set, dalmation_model_linear, "-b 1")
print 'dollabill linear'
result2, something, p2 = svm_predict([1]*len(testing_set), testing_set, dollabill_model_linear, "-b 1")
print 'pizza linear'
result3, something, p3 = svm_predict([1]*len(testing_set), testing_set, pizza_model_linear, "-b 1")
print 'soccer linear'
result4, something, p4 = svm_predict([1]*len(testing_set), testing_set, soccer_ball_model_linear, "-b 1")
print 'sunflower linear'
result5, something, p5 = svm_predict([1]*len(testing_set), testing_set, sunflower_model_linear, "-b 1")
When I run this and take a few accuracy measurements, the result is around 20% every time, with the last dataset (the sunflowers) near 100% accuracy and the others near 5%. I believe I am passing the data in the correct format for libsvm, and I can't find any clues. I have tried many different values of c, from 1e-8 to 1e8, and the accuracy for each one changed only slightly, by no more than 5%.
Any input would be greatly appreciated and I'd be glad to give more info!
Upvotes: 0
Views: 279
Reputation: 2502
You pass the testing_set list to svm_predict, and for the true labels you pass [1]*len(testing_set), which is not correct. For the dalmation model, the true labels should be dalmation_y, calculated earlier.
Upvotes: 2
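To illustrate why the placeholder labels distort the reported accuracy, here is a minimal sketch that does not call libsvm at all. It assumes five classes of 50 images each (hypothetical sizes based on the "around 50" in the question) and a hypothetical helper, accuracy, that simply compares two label lists. Even a model that predicts every training sample correctly scores only 20% when it is compared against an all-ones label list.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the supplied true labels."""
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return float(matches) / len(true_labels)

# Hypothetical sizes: 5 classes with 50 training images each.
n_per_class = 50
n_classes = 5

# One-vs-rest labels for the first class, built the same way as dalmation_y.
dalmation_y = [1] * n_per_class + [-1] * n_per_class * (n_classes - 1)

# Pretend the model is perfect, i.e. its predictions equal the real labels.
perfect_predictions = list(dalmation_y)

# Scored against the all-ones placeholder, as in the question.
acc_placeholder = accuracy([1] * len(dalmation_y), perfect_predictions)
# Scored against the real one-vs-rest labels.
acc_real = accuracy(dalmation_y, perfect_predictions)

print(acc_placeholder)  # 0.2
print(acc_real)  # 1.0
```

In the question's code, the corresponding change would be to call svm_predict(dalmation_y, testing_set, dalmation_model_linear, "-b 1"), and likewise with dollabill_y, pizza_y, soccer_ball_y and sunflower_y for the other models. It may also be worth checking create_image_list: it calls img.load() on the original image, so img_resized is never used and the pixels read are not those of the 32x32 version.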