Reputation: 518
I have a neural network using Keras with a TensorFlow backend:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Fix the random seed for reproducibility
seed = 7
np.random.seed(seed)

# Feed-forward network for binary classification on 11 input features
model = Sequential()
model.add(Dense(32, input_dim=11, init='uniform', activation='relu'))
model.add(Dense(12, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, result_train, nb_epoch=50, batch_size=5)

# Evaluate on the held-out test set
scores = model.evaluate(X_test, result_test)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
I am modelling drop-outs from public colleges, using their socio-economic parameters as variables. Initially I have 8 CSV files (named a, b, c, d, e, f, g and h), each with 12 column headers and 300,000 rows. The result is binary, 0 for retained and 1 for dropped out, and I normalized the data before feeding it to the NN.
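For reference, the data loading and normalization were roughly along these lines (a sketch only: the '.csv' file names, the 'result' column name and the choice of MinMaxScaler are placeholders, not necessarily exactly what I used):

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Stack the training files and the hold-out files
train_df = pd.concat([pd.read_csv(f + '.csv') for f in ['a', 'b', 'c', 'd', 'e', 'f']])
test_df = pd.concat([pd.read_csv(f + '.csv') for f in ['g', 'h']])

X_train, result_train = train_df.drop('result', axis=1).values, train_df['result'].values
X_test, result_test = test_df.drop('result', axis=1).values, test_df['result'].values

# Scale the 11 features to [0, 1], fitting the scaler on the training data only
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)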
My first training set was a, b, c, d, e and f, with g and h held out for testing. The neural network gave me reasonably good specificity, sensitivity and accuracy: 70%, 65% and 66%.
With that, I trained another NN with the same architecture as stated above, this time with c, d, e, f, g and h as the training datasets and a and b as the new hold-out for testing, but the model gives very poor specificity, sensitivity and accuracy: 42%, 48% and 47%. I am wondering why. Are there any published papers citing this kind of phenomenon in neural networks?
thanks!
Upvotes: 0
Views: 1098
Reputation: 19252
Many machine learning methods can suffer from a problem known as over-fitting. Wikipedia gives a variety of references to this.
The reason you use at least one hold-out data set is to test how well your trained model fits unseen data. In theory you could be 100% accurate on one data set and yet perform very badly on new data.
Some people use cross-validation rather than just one or two held-back data sets - this puts each data point in both a test set and a training set. For example, with 10 data points, use 9 to train and try to predict the tenth, then repeat for each of the 10 possible splits.
This can be appropriate if the various patterns are not evenly distributed in a data set.
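As a rough sketch, k-fold cross-validation with scikit-learn's StratifiedKFold could look like this (the fold count and the build_model helper that rebuilds the network from the question are illustrative assumptions):

import numpy as np
from sklearn.model_selection import StratifiedKFold

# X, y: the full feature matrix and binary labels, stacked from all eight files
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
accuracies = []

for train_idx, test_idx in kfold.split(X, y):
    model = build_model()  # hypothetical helper that rebuilds the network above
    model.fit(X[train_idx], y[train_idx], nb_epoch=50, batch_size=5, verbose=0)
    loss, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
    accuracies.append(acc)

# A large spread across folds means the model is very sensitive to which data it sees
print("Accuracy: %.2f%% (+/- %.2f%%)" % (np.mean(accuracies) * 100, np.std(accuracies) * 100))

StratifiedKFold keeps the proportion of drop-outs roughly the same in every fold, which also helps with the uneven-distribution point above.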
If one of your training sets has all drop-outs, then a model predicting that everyone drops out will fit it best, but will not generalise to any data with no drop-outs.
It is often worth doing some exploratory data analysis to see if some of your data sets are not representative.
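A quick check of that kind could look like this (a sketch; it assumes each CSV has a header row, a '.csv' extension and the label in a column named 'result' - adjust to your actual files):

import pandas as pd

# Compare the drop-out rate in each file; very different rates would suggest
# that some files are not representative of the others
for name in ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']:
    df = pd.read_csv(name + '.csv')
    print(name, 'rows:', len(df), 'drop-out rate: %.3f' % df['result'].mean())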
Upvotes: 1