Reputation: 103
I am trying to create a binary classifier on a data set of 10,000. I have tried multiple Activators and Optimizers, however the results are always between 56.8% and 58.9%. Given the fairly steady results over many dozen iterations, I assume the problem is either:
This is the data set: training-set.csv
I may be able to get 2000 more records but that would be it.
My question is: is there something in the way my model is constructed that is preventing it from learning to a higher degree?
Note that I am happy to have as many layers and nodes as needed, and time is not a factor in generating the model.
dataframe = pandas.read_csv(r"training-set.csv", index_col=None)
dataset = dataframe.values
X = dataset[:,0:48].astype(float)
Y = dataset[:,48]
#count the input variables
col_count = X.shape[1]
#normalize X
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_scale = sc_X.fit_transform(X)
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scale, Y, test_size = 0.2)
# define baseline model
activator = 'linear' #'relu' 'sigmoid' 'softmax' 'exponential' 'linear' 'tanh'
#opt = 'Adadelta' #adam SGD nadam RMSprop Adadelta
nodes = 1000
max_layers = 2
max_epochs = 100
max_batch = 32
loss_funct = 'binary_crossentropy' #for binary
last_act = 'sigmoid' # 'softmax' 'sigmoid' 'relu'
def baseline_model():
# create model
model = Sequential()
model.add(Dense(nodes, input_dim=col_count, activation=activator))
for x in range(0, max_layers):
model.add(Dropout(0.2))
model.add(Dense(nodes, input_dim=nodes, activation=activator))
#model.add(BatchNormalization())
model.add(Dense(1, activation=last_act)) #model.add(Dense(1, activation=last_act))
# Compile model
adam = Adam(lr=0.001)
model.compile(loss=loss_funct, optimizer=adam, metrics=['accuracy'])
return model
estimator = KerasClassifier(build_fn=baseline_model, epochs=max_epochs, batch_size=max_batch)
estimator.fit(X_train, y_train)
y_pred = estimator.predict(X_test)
#confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
score = np.sum(cm.diagonal())/float(np.sum(cm))
Upvotes: 0
Views: 302
Reputation: 60380
Two points:
There is absolutely no point in stacking dense layers with linear activations - they only result to a single linear unit; change to activator = 'relu'
(and just don't bother with the other candidate activation functions in your commented-out list).
Do not use dropout by default, especially if your model has difficulties in learning (like here); remove the dropout layer(s), and just be ready to put (some of) them back in only in case you see overfitting (you are currently still very far from that point, so this is not something to worry about now).
Upvotes: 1