zar3bski
zar3bski

Reputation: 3171

Keras poor performance (Loss and optimization functions?)

I've spend the last 2 weeks struggling with my NN. The aim is to predict trip durations of taxi courses based on several

Here is the simplest version

X_train = trainData.as_matrix(columns=["fareDistance","hour","day","pickup_longitude","pickup_latitude","dropoff_longitude","dropoff_latitude"])    
Y_train = np.array(trainData["trip_duration"])
model = Sequential()
model.add(Dense(32, input_dim=7, activation='linear'))
model.add(Dense(12, activation='linear'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_absolute_percentage_error', optimizer='adagrad', metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train, epochs=10, validation_split=0.2)

I also tried to merge two different models for numerical variables on one hand and categorical on the other but it didn't change a thing. Depending on the combinations of Loss and optimization function either the loss and accuracy remain quite the same (acc. 0.0016) or I don't even have non null acc.

A friend of mine replicated the NN in pure TensorFlow and got the same kind of results

Train on 233383 samples, validate on 58346 samples 
Epoch 1/20 233383/233383 [==============================] - 15s - loss: 45.9550 - acc: 0.0016 - val_loss: 46.2514 - val_acc: 0.0014
Epoch 2/20 233383/233383 [==============================] - 15s - loss: 45.8675 - acc: 0.0014 - val_loss: 46.2675 - val_acc: 0.0015
Epoch 3/20 233383/233383 [==============================] - 15s - loss: 45.8465 - acc: 0.0015 - val_loss: 46.2131 - val_acc: 0.0013
Epoch 4/20 233383/233383 [==============================] - 15s - loss: 45.8283 - acc: 0.0014 - val_loss: 46.2478 - val_acc: 0.0016
Epoch 5/20 233383/233383 [==============================] - 15s - loss: 45.8214 - acc: 0.0015 - val_loss: 46.2043 - val_acc: 0.0013
Epoch 6/20 233383/233383 [==============================] - 14s - loss: 45.8122 - acc: 0.0014 - val_loss: 46.2526 - val_acc: 0.0014
Epoch 7/20 233383/233383 [==============================] - 12s - loss: 45.7990 - acc: 0.0015 - val_loss: 46.1821 - val_acc: 0.0014
Epoch 8/20 233383/233383 [==============================] - 12s - loss: 45.7964 - acc: 0.0016 - val_loss: 46.1761 - val_acc: 0.0013
Epoch 9/20 233383/233383 [==============================] - 11s - loss: 45.7898 - acc: 0.0015 - val_loss: 46.1804 - val_acc: 0.0016

Am I missing something -- like something big, obvious -- which would explain why any attempt to change activation, loss or optimization function ends up doing the same?

Thanks in advance D.

Upvotes: 2

Views: 771

Answers (1)

Vadim
Vadim

Reputation: 4529

try this:

X_train = trainData.as_matrix(columns=["fareDistance","hour","day","pickup_longitude","pickup_latitude","dropoff_longitude","dropoff_latitude"])    
Y_train = np.array(trainData["trip_duration"])
model = Sequential()
model.add(Dense(32, input_dim=7, activation='elu'))
model.add(Dense(12, activation='elu'))
model.add(Dense(1, kernel_initializer='normal'))
model.compile(loss='mean_absolute_percentage_error', optimizer='rmsprop')
model.summary()
model.fit(X_train, Y_train, epochs=10, validation_split=0.2)

you can also try the adam optimizer.

model.compile(loss='mean_absolute_percentage_error', optimizer='adam')

Update:

  • If the code above didn't help you it means your input data either not normalized or very dirty.

Upvotes: 2

Related Questions