Reputation: 17

ANN problem: I am building an ANN model to predict the profit of a new startup based on certain features

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

Loading the data set into a pandas DataFrame

import pandas as pd
df = pd.read_csv(r"E:\50_Startups.csv")
df.drop(['State'],axis = 1, inplace = True)

from sklearn.preprocessing import MinMaxScaler
mm = MinMaxScaler()
df.iloc[:,:] = mm.fit_transform(df.iloc[:,:])
info = df.describe()

x = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split( x,y, test_size=0.2, random_state=42)

Initializing the model

model = Sequential()
model.add(Dense(40,input_dim =3,activation="relu",kernel_initializer='he_normal'))
model.add(Dense(30,activation="relu"))
model.add(Dense(1))
model.compile(loss="mean_squared_error",optimizer="adam",metrics=["accuracy"])

Fitting the model on the training data

model.fit(x=x_train,y=y_train,epochs=150, batch_size=32,verbose=1)

Evaluating the model on test data

eval_score_test = model.evaluate(x_test,y_test,verbose = 1)

I am getting zero accuracy.

Upvotes: 0

Answers (3)

Nickman

Reputation: 17

Thank you to everyone who took the time to help me. I am posting the code that worked for me, which I arrived at after consulting a friend, in the hope that it helps anyone else who is stuck looking for answers.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
import pandas as pd
from sklearn.model_selection import train_test_split

# Loading the data set using pandas as data frame format 
startups = pd.read_csv(r"E:\0Assignments\DL_assign\50_Startups.csv")
startups = startups.drop("State", axis =1)

train, test = train_test_split(startups, test_size = 0.2)

x_train = train.iloc[:,0:3].values.astype("float32")
x_test = test.iloc[:,0:3].values.astype("float32")
y_train = train.Profit.values.astype("float32")
y_test = test.Profit.values.astype("float32")

def norm_func(i):
     x = ((i-i.min())/(i.max()-i.min()))
     return (x)

x_train = norm_func(x_train)
x_test = norm_func(x_test)
y_train = norm_func(y_train)
y_test = norm_func(y_test)

# one hot encoding outputs for both train and test data sets 
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)

# Storing the number of classes into the variable num_of_classes 
num_of_classes = y_test.shape[1]
x_train.shape
y_train.shape
x_test.shape
y_test.shape

# Creating a user-defined function that builds and returns the ANN model
# to be trained
def design_mlp():
    # Initializing the model 
    model = Sequential()
    model.add(Dense(500,input_dim =3,activation="relu"))
    model.add(Dense(200,activation="tanh"))
    model.add(Dense(100,activation="tanh"))
    model.add(Dense(50,activation="tanh"))
    model.add(Dense(num_of_classes,activation="linear"))
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])
    return model

# building the MLP model, to be trained on the train data set and evaluated on the test data set
model = design_mlp()

# fitting model on train data
model.fit(x=x_train,y=y_train,batch_size=100,epochs=10)

# Evaluating the model on test data
eval_score_test = model.evaluate(x_test, y_test, verbose=1)
print("Test accuracy: %.3f%%" % (eval_score_test[1] * 100))

# accuracy score on train data
eval_score_train = model.evaluate(x_train, y_train, verbose=0)
print("Train accuracy: %.3f%%" % (eval_score_train[1] * 100))

Upvotes: 0

Abhishek Prajapat

Reputation: 1888

Adding to @GuintherKovalski's answer: accuracy is not meant for regression, but if you still want to use it, you can combine it with a threshold as follows.

Set a threshold such that a prediction counts as correct if the absolute difference between the predicted value and the actual value is less than or equal to the threshold, and as incorrect otherwise.
Example: predicted values = [0.3, 0.7, 0.8, 0.2], original values = [0.2, 0.8, 0.5, 0.4]. The absolute differences are [0.1, 0.1, 0.3, 0.2]. With a threshold of 0.2, correct = [1, 1, 0, 1], and the accuracy is correct.sum()/len(correct) = 3/4 = 0.75.
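The arithmetic in this example can be checked with a few lines of plain Python. This is only a sketch of the idea; the helper name `threshold_accuracy` is made up for illustration and is not part of any library.

```python
def threshold_accuracy(y_true, y_pred, threshold):
    """Fraction of predictions within `threshold` of the true value."""
    # A prediction is "correct" when |predicted - actual| <= threshold.
    correct = [1 if abs(p - t) <= threshold else 0
               for t, p in zip(y_true, y_pred)]
    return sum(correct) / len(correct)


predicted = [0.3, 0.7, 0.8, 0.2]
original = [0.2, 0.8, 0.5, 0.4]

accuracy = threshold_accuracy(original, predicted, threshold=0.2)
print(accuracy)  # 0.75
```

Only the third prediction (0.8 against 0.5) falls outside the threshold, so three of the four count as correct.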

This could be implemented in TensorFlow as follows:

import numpy as np
import tensorflow as tf
from sklearn.datasets import make_regression

data = make_regression(10000)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(100,))])

def custom_metric(a, b):
    # a: true values, b: predicted values
    threshold = 1 # Choose accordingly
    abs_diff = tf.abs(b - a)
    # A prediction is correct when it lies within the threshold
    correct = abs_diff <= threshold
    correct = tf.cast(correct, dtype=tf.float32)
    res = tf.math.reduce_mean(correct)
    return res

model.compile('adam', 'mae', metrics=[custom_metric])
model.fit(data[0], data[1], epochs=30, batch_size=32)

Upvotes: 0

Guinther Kovalski

Reputation: 1929

The problem is that accuracy is a metric for discrete values (classification).
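To illustrate the point, here is a simplified sketch in plain Python. It is not Keras's actual implementation of `"accuracy"`; it only assumes that accuracy counts predictions that match the target exactly, and the helper name and the values are made up for the example.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions that are exactly equal to the target."""
    matches = [1 if t == p else 0 for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Classification: discrete labels can match exactly.
class_true = [1, 0, 1, 1]
class_pred = [1, 0, 0, 1]
print(exact_match_accuracy(class_true, class_pred))  # 0.75

# Regression: predictions are close to the targets but never identical,
# so exact-match accuracy is zero even though the model is doing well.
reg_true = [0.62, 0.15, 0.88, 0.40]  # hypothetical scaled profits
reg_pred = [0.60, 0.17, 0.85, 0.41]
print(exact_match_accuracy(reg_true, reg_pred))  # 0.0
```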

For regression you should use a metric such as R2 score, MAPE, or SMAPE instead.

For example:

model.compile(loss="mean_squared_error",optimizer="adam",metrics=["mean_absolute_percentage_error"])
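For reference, the three metrics can also be computed by hand from true and predicted values, for example on the output of `model.predict`. The following is a minimal sketch in plain Python with made-up numbers. The function names are illustrative, and the SMAPE formula shown is one of several definitions in use, so treat it as an assumption. scikit-learn also offers `sklearn.metrics.r2_score` if you would rather not write it yourself.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def smape(y_true, y_pred):
    """Symmetric MAPE, in percent, using 2|p - t| / (|t| + |p|)."""
    errors = [2.0 * abs(p - t) / (abs(t) + abs(p))
              for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical profits and predictions, for illustration only.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 180.0, 300.0]

print("R2:    %.3f" % r2_score(y_true, y_pred))
print("MAPE:  %.3f%%" % mape(y_true, y_pred))
print("SMAPE: %.3f%%" % smape(y_true, y_pred))
```

Unlike accuracy, all three vary smoothly with the size of the prediction error, which is what you want when the target is a continuous value such as profit.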

Upvotes: 0
