Reputation: 383
I'm using Keras and I'm trying to build a neural network to predict the interest rate from the given data. The data looks like this:
loan_amnt annual_inc emp_length int_rate
10000 38000.0 5.600882 12.40
13750 17808.0 5.600882 28.80
26100 68000.0 10.000000 20.00
13000 30000.0 1.000000 20.00
7000 79950.0 7.000000 7.02
The features (X) are loan_amnt, annual_inc, and emp_length. The target (y) is int_rate.
Here's my process and what I've done after normalizing the data:
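from keras.models import Sequential
from keras.layers import Dense

# Building out the model: 3 input features -> 9 -> 3 -> 1 linear output
model = Sequential([
    Dense(9, activation='relu', input_shape=(3,)),
    Dense(3, activation='relu'),
    Dense(1, activation='linear'),
])

# Compiling the model: MAPE is the loss being optimized, MSE is only reported
model.compile(loss='mean_absolute_percentage_error',
              metrics=['mse'],
              optimizer='RMSprop')

# Training on the normalized training data
hist = model.fit(X_train, Y_train,
                 batch_size=100, epochs=20, verbose=1)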
Here's an output sample after running model.fit():
Epoch 1/20
693/693 [==============================] - 1s 905us/step - loss: 96.2391 - mean_squared_error: 179.8007
Epoch 2/20
693/693 [==============================] - 0s 21us/step - loss: 95.2362 - mean_squared_error: 176.9865
Epoch 3/20
693/693 [==============================] - 0s 20us/step - loss: 94.4133 - mean_squared_error: 174.6367
Finally, I evaluated the model with model.evaluate(X_train, Y_train) and got the following output:
693/693 [==============================] - 0s 372us/step
[77.88501817667468, 132.0109032635049]
The question is, how can I know if my model is doing well or not, and how can I read the numbers?
Upvotes: 1
Views: 61
Reputation: 3648
You should not judge your model on the training data, because that cannot reveal overfitting: a model can score well on data it has already seen and still generalize poorly. Instead, you should set some data aside (20% is what I usually use) and validate your results on it.
If you plan on doing a lot of testing, you should set aside a third dataset used only for testing the final solution.
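As a rough illustration of the split described above, here is a minimal sketch in plain Python. The helper name split_indices, the 10% test fraction, and the seed are assumptions made for this example; 693 is the row count shown in the question's training log. In Keras itself, model.fit also accepts validation_split or validation_data, which may be more convenient.

```python
import random


def split_indices(n_rows, val_frac=0.2, test_frac=0.0, seed=0):
    """Shuffle row indices and cut them into train / validation / test lists.

    Hypothetical helper for illustration only; not part of Keras.
    """
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)  # shuffle so the split is not order-dependent

    n_test = int(n_rows * test_frac)
    n_val = int(n_rows * val_frac)

    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


# 693 rows as in the question; the 20% / 10% fractions are assumptions.
train_idx, val_idx, test_idx = split_indices(693, val_frac=0.2, test_frac=0.1)
print(len(train_idx), len(val_idx), len(test_idx))
```

The index lists can then be used to select rows from X and y, so that the model is fitted on the training rows only and evaluated on rows it has never seen.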
You can also use k-fold cross-validation: split the data into k parts, train on k-1 of them and evaluate on the remaining one, then repeat so that each part is used for evaluation once. Averaging the results gives you a better understanding of how accurate your model is.
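To make the k-fold idea concrete, here is a small sketch that only generates the index splits. The function name k_fold_indices and the choice of k=5 are assumptions for the example, and it assumes the rows have already been shuffled. A fresh model would be trained for each fold and the k validation scores averaged.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, val_idx) pairs so that each row is validated exactly once.

    Hypothetical helper for illustration; assumes rows are already shuffled.
    """
    idx = list(range(n_rows))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        val = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, val
        start += size


# 693 rows as in the question; k=5 is an assumption.
folds = list(k_fold_indices(693, k=5))
for i, (train, val) in enumerate(folds):
    print(f"fold {i}: train={len(train)} val={len(val)}")
```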
Upvotes: 1
Reputation: 1290
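Your loss is not the MSE: you compile the model with mean_absolute_percentage_error (MAPE) as the loss and track mse as an extra metric, so model.evaluate returns [MAPE, MSE], here about [77.9, 132.0]. The two are defined as:
MSE = mean((y_true - y_pred)^2)
MAPE = 100 * mean(|y_true - y_pred| / |y_true|)
So when you have 132. as the MSE metric, your root mean squared error is sqrt(132.) ~= 11.5, meaning y_pred is typically off from y_true by roughly 11.5 in the units of int_rate. That is quite a bit on your data, as the MAPE loss also shows: you have ~78% error on average.
For example, if y_true was 20, you would be predicting something like 36 or 4.
What counts as a good error depends on your case, but a MAPE of around 10% could be a reasonable target.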
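To read these numbers by hand, the following sketch computes MSE, RMSE, and MAPE in plain Python. The y_true values are the int_rate column from the question's sample rows; the y_pred values are made up purely for illustration and are not model output. The mape function assumes no target is zero (Keras guards against that internally, which this sketch does not).

```python
import math


def mse(y_true, y_pred):
    """Mean squared error, in squared units of the target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no zero targets."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [12.40, 28.80, 20.00, 20.00, 7.02]  # int_rate values from the question
y_pred = [10.0, 20.0, 15.0, 25.0, 9.0]       # hypothetical predictions

print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred), "%")
```

RMSE is usually easier to interpret than MSE because it is on the same scale as int_rate, while MAPE expresses the error relative to each true value.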
Upvotes: 1