How to Design the Neural Network?

Question

I was trying to make a deep learning prediction model for predicting whether a person is a CKD patient or not. Can you please tell me? How can I design a neural network for it? How many neurons should I add in each layer? Or is there any other method in Keras to do so? The dataset link: https://github.com/Samar-080301/Python_Project/blob/master/ckd_full.csv

import tensorflow as tf
from tensorflow import keras
import pandas as pd
from sklearn.model_selection import train_test_split
import os
from matplotlib import pyplot as plt
os.chdir(r'C:\Users\samar\OneDrive\desktop\projects\Chronic_Kidney_Disease')
os.getcwd()
x=pd.read_csv('ckd_full.csv')
y=x[['class']]
y['class']=y['class'].replace(to_replace=(r'ckd',r'notckd'), value=(1,0))
x=x.drop(columns=['class'])
x['rbc']=x['rbc'].replace(to_replace=(r'normal',r'abnormal'), value=(1,0))
x['pcc']=x['pcc'].replace(to_replace=(r'present',r'notpresent'), value=(1,0))
x['ba']=x['ba'].replace(to_replace=(r'present',r'notpresent'), value=(1,0))
x['pc']=x['pc'].replace(to_replace=(r'normal',r'abnormal'), value=(1,0))
x['htn']=x['htn'].replace(to_replace=(r'yes',r'no'), value=(1,0))
x['dm']=x['dm'].replace(to_replace=(r'yes',r'no'), value=(1,0))
x['cad']=x['cad'].replace(to_replace=(r'yes',r'no'), value=(1,0))
x['pe']=x['pe'].replace(to_replace=(r'yes',r'no'), value=(1,0))
x['ane']=x['ane'].replace(to_replace=(r'yes',r'no'), value=(1,0))
x['appet']=x['appet'].replace(to_replace=(r'good',r'poor'), value=(1,0))
x[x=="?"]=np.nan
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.01)
#begin the model
model=keras.models.Sequential()
model.add(keras.layers.Dense(128,input_dim = 24, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu)) # adding a layer with 128 nodes and relu activaation function
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu)) # adding a layer with 128 nodes and relu activaation function
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu)) # adding a layer with 128 nodes and relu activaation function
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu)) # adding a layer with 128 nodes and relu activaation function 
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu)) # adding a layer with 128 nodes and relu activaation function
model.add(tf.keras.layers.Dense(2,activation=tf.nn.softmax)) # adding a layer with 2 nodes and softmax activaation function
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy']) # specifiying hyperparameters
model.fit(xtrain,ytrain,epochs=5) # load the model
model.save('Nephrologist') # save the model with a unique name
myModel=tf.keras.models.load_model('Nephrologist')  # make an object of the model
prediction=myModel.predict((xtest))



     C:\Users\samar\anaconda3\lib\site-packages\ipykernel_launcher.py:12: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead
    
    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
      if sys.path[0] == '':
Epoch 1/5
396/396 [==============================] - 0s 969us/sample - loss: nan - acc: 0.3561
Epoch 2/5
396/396 [==============================] - 0s 343us/sample - loss: nan - acc: 0.3763
Epoch 3/5
396/396 [==============================] - 0s 323us/sample - loss: nan - acc: 0.3763
Epoch 4/5
396/396 [==============================] - 0s 283us/sample - loss: nan - acc: 0.3763
Epoch 5/5
396/396 [==============================] - 0s 303us/sample - loss: nan - acc: 0.3763

My Koryto · Accepted Answer

Here is the structure that I achieved 100% test accuracy with:

model=keras.models.Sequential()
model.add(keras.layers.Dense(200,input_dim = 24, activation=tf.nn.tanh))
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']) # specifiying hyperparameters

xtrain_tensor = tf.convert_to_tensor(xtrain, dtype=tf.float32)
ytrain_tensor = tf.convert_to_tensor(ytrain, dtype=tf.float32)
model.fit(xtrain_tensor , ytrain_tensor , epochs=500, batch_size=128, validation_split = 0.15, shuffle=True, verbose=2) # load the model
results = model.evaluate(xtest, ytest, batch_size=128)

Output:

3/3 - 0s - loss: 0.2560 - accuracy: 0.9412 - val_loss: 0.2227 - val_accuracy: 0.9815
Epoch 500/500
3/3 - 0s - loss: 0.2225 - accuracy: 0.9673 - val_loss: 0.2224 - val_accuracy: 0.9815
1/1 [==============================] - 0s 0s/step - loss: 0.1871 - accuracy: 1.0000

The last line represents the evaluation of the model on the test dataset. Seems like it generalized well :)

------------------------------------------------- Original answer below --------------------------------------------------- I would go with a logistic regression model first in order to see if there is any predictive value to your dataset.

model=keras.models.Sequential()
model.add(keras.layers.Dense(1,input_dim = 24, activation=tf.nn.sigmoid))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) # specifiying hyperparameters
model.fit(xtrain,ytrain,epochs=100) # Might require more or less epoches. It depends on the amount of noise in your dataset.

If you see you receive an accuracy score that satisfies you, I would give it a try and add 1 or 2 more dense hidden layers with between 10 to 40 nodes. It's important to mention that my advice is solely based on my experience.

I HIGHLY(!!!!) recommend transforming the y_label into a binary value when 1 represents the positive class (a record is a record of a CKD patient) and 0 represents the negative class. Let me know if it works, and if it doesn't I'll also try to play with your dataset.

How to Design the Neural Network?

Answers (2)

Related Questions