M.S.

Reputation: 43

Keras low accuracy classification task

I was playing around with Keras and a dummy dataset. I wanted to see how much better a neural network would do compared to a standard SVM with an RBF kernel. The task was simple: predict the class in {0, 1, 2} for a 20-dimensional vector.

I noticed that the neural network does horribly. The SVM gets around 90% correct, whereas the neural network flounders at 40%. What am I doing wrong in my code? This is most likely an error on my part, but after a few hours of trying various parameters on the NN, I've given up.

Code

from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

# generate some data (note: make_multilabel_classification returns Y as a
# binary multi-hot indicator matrix of shape (4000, 3), not single labels)
dummyX, dummyY = make_multilabel_classification(n_samples=4000, n_features=20, n_classes=3)

# neural network
model = Sequential()
model.add(Dense(20, input_dim=20))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(20))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(3))
model.add(Activation('softmax'))

model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=['accuracy'])

X_train, X_test, y_train, y_test = train_test_split(dummyX, dummyY, test_size=0.20, random_state=42)

model.fit(X_train, y_train, nb_epoch=20, batch_size=30, validation_data=(X_test, y_test))
# Epoch 20/20
# 3200/3200 [==============================] - 0s - loss: 0.2469 - acc: 0.4366 - val_loss: 0.2468 - val_acc: 0.4063


# SVM - note that y_train and y_test are binary label matrices; I haven't
# included the multi-class converter code here for brevity
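# (A hypothetical stand-in for the omitted converter, purely an assumption:
# collapse each multi-hot row to a single class label via argmax.)
y_train = y_train.argmax(axis=1)
y_test = y_test.argmax(axis=1)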
svm = SVC()
svm.fit(X_train, y_train)
svm.score(X_test, y_test)
# 0.891249

TL;DR

Made dummy data; neural network sucked; SVM walloped it. Please help.

Upvotes: 3

Views: 2213

Answers (1)

Stepan Novikov

Reputation: 1396

After several runs of your example, I found that the accuracy fluctuates between 0.3 and 0.9 regardless of the learning technique. I believe this is happening because the data is random and meaningless – it is hardly possible to recognize any features in white noise.

I suggest using a meaningful dataset like MNIST to compare the accuracy of different approaches.
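For instance, MNIST can be loaded directly through Keras; a minimal sketch (the reshaping and scaling below are common conventions, not something from your question):

from keras.datasets import mnist
from keras.utils import np_utils

# flatten the 28x28 images and scale pixel values to [0, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255

# one-hot encode the digit labels for a 10-way softmax output
Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)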

About the parameters: as Matias mentioned, it's better to use categorical_crossentropy for this case. I also suggest the adadelta optimizer if you don't want to tune parameters manually.
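Applied to your model, the compile step would become (a minimal sketch of these two suggestions, everything else unchanged):

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])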

Upvotes: 1
