I just trained my first ML model on the Titanic dataset from Kaggle. I am getting an RMSE of ~0.4. Is that good?

Question

Please note: I trained my model only on the numerical columns, not the string columns.

Please also suggest some resources for going further into machine learning, as I really like this subject.

Thank you

Here is the code, which gives the following output:

train rmse: 0.42
test rmse: 0.43

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
dftest = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')

# Replace zero fares with a fixed value, then plot fare against age
dftrain.loc[dftrain['fare'] == 0, 'fare'] = 34.85
plt.plot(list(dftrain.age), list(dftrain.fare), '.', markersize=1)

# Drop the string columns so that only the numerical features remain
dftrain = dftrain.drop(['sex', 'class', 'deck', 'embark_town', 'alone'], axis=1)
X = dftrain.loc[:, dftrain.columns != 'survived']
y = dftrain.loc[:, 'survived']

# Small fully connected network with a sigmoid output for the binary target
model = Sequential()
model.add(Dense(128, activation='relu', input_dim=4))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=200)

# Prepare the evaluation set in the same way as the training set
dftest = dftest.drop(['sex', 'class', 'deck', 'embark_town', 'alone'], axis=1)
A = dftest.loc[:, dftest.columns != 'survived']
b = dftest.loc[:, 'survived']

# RMSE between the 0/1 labels and the predicted probabilities
train_pred = model.predict(X)
train_rmse = np.sqrt(mean_squared_error(y, train_pred))
test_pred = model.predict(A)
test_rmse = np.sqrt(mean_squared_error(b, test_pred))

print("train rmse: {:0.2f}".format(train_rmse))
print("test rmse: {:0.2f}".format(test_rmse))
```
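One way to put an RMSE of ~0.4 in context is to compare it with a trivial baseline. For a 0/1 target, always predicting the overall survival rate p gives an RMSE of sqrt(p * (1 - p)). Assuming roughly 38% of passengers survived, that would be about 0.49. The sketch below is only an illustration: `rmse`, `baseline_rmse`, `accuracy`, `y_demo` and `prob_demo` are hypothetical names and made-up values, not part of the code above. In the code above, `b` and `test_pred` would take the place of the demo arrays.

```python
import numpy as np


def rmse(y_true, y_prob):
    """Root mean squared error between 0/1 labels and predicted probabilities."""
    # ravel() matters: model.predict() returns shape (n, 1), and subtracting
    # that from a shape (n,) array would silently broadcast to (n, n).
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_prob = np.asarray(y_prob, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_true - y_prob) ** 2)))


def baseline_rmse(y_true):
    """RMSE of always predicting the base rate p, which equals sqrt(p * (1 - p))."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    p = y_true.mean()
    return rmse(y_true, np.full_like(y_true, p))


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of correct labels after thresholding the probabilities."""
    y_true = np.asarray(y_true).ravel()
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)
    return float(np.mean(y_pred == y_true))


# Hypothetical labels and probabilities, for illustration only.
y_demo = np.array([0, 0, 0, 1, 1])
prob_demo = np.array([0.2, 0.4, 0.6, 0.7, 0.9])

print("model rmse:    {:0.2f}".format(rmse(y_demo, prob_demo)))
print("baseline rmse: {:0.2f}".format(baseline_rmse(y_demo)))
print("accuracy:      {:0.2f}".format(accuracy(y_demo, prob_demo)))
```

Since this is a classification problem, accuracy (or the `accuracy` metric already reported by `model.fit`) is usually easier to interpret than RMSE. RMSE on the predicted probabilities is mainly useful relative to a baseline like the one sketched here.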
