Aditya Sivadas

Reputation: 1

Why am I getting a constant loss and accuracy?

This is my code:-

# Importing the essential libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Getting the dataset
data = pd.read_csv("sales_train.csv")
X = data.iloc[:, 1:-1].values 
y = data.iloc[:, -1].values
# y = np.array(y).reshape(-1, 1)

# Getting the values for november 2013 and 2014 to predict 2015
list_of_november_values = []
list_of_november_values_y = []
for i in range(0, len(y)):
    if X[i, 0] == 10 or X[i, 0] == 22:
        list_of_november_values.append(X[i, 1:])
        list_of_november_values_y.append(y[i])

# Converting list to array
arr_of_november_values = np.array(list_of_november_values)
y_train = np.array(list_of_november_values_y).reshape(-1, 1)

# Scaling the independent values 
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(arr_of_november_values)

# Creating the neural network
from keras.models import Sequential
from keras.layers import Dense

nn = Sequential()
nn.add(Dense(units=120, activation='relu'))
nn.add(Dense(units=60, activation='relu'))
nn.add(Dense(units=30, activation='relu'))
nn.add(Dense(units=15, activation='relu'))
nn.add(Dense(units=1, activation='softmax'))
nn.compile(optimizer='adam', loss='mse')
nn.fit(X_train, y_train, batch_size=100, epochs=25)

# Saving the weights
nn.save_weights('weights.h5')
print("Weights Saved")

The loss stays at exactly the same value for every epoch. Is there a concept I am missing that is causing my loss to be constant?

Here is the dataset for the code.

Upvotes: 0

Views: 58

Answers (2)

Nicolas Gervais

Reputation: 36584

Change this line:

nn.add(Dense(units=1, activation='softmax'))

To this line:

nn.add(Dense(units=1))

For a regression problem, you don't need an activation function on the output layer. Without one, Keras defaults to a linear activation, so the output can take any real value.
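
For reference, a minimal sketch of the corrected model, assuming the same layer sizes and compile settings as in the question:

# Same architecture as the question, with a linear output for regression
from keras.models import Sequential
from keras.layers import Dense

nn = Sequential()
nn.add(Dense(units=120, activation='relu'))
nn.add(Dense(units=60, activation='relu'))
nn.add(Dense(units=30, activation='relu'))
nn.add(Dense(units=15, activation='relu'))
nn.add(Dense(units=1))  # no activation -> linear output, works with MSE loss
nn.compile(optimizer='adam', loss='mse')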

Upvotes: 0

user192361237

Reputation: 538

The predominant reason is your odd choice of final-layer activation, paired with the loss function used. Reconsider this: you are using a softmax activation on a single-unit fully connected layer. Softmax takes a vector and rescales it so that its values sum to one while preserving their relative proportions, according to the following function:

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

Since your final layer has only one unit, the sum in the denominator is just the single exponentiated value in the numerator, so the network will only ever output 1. The output no longer depends on the weights, thus there are no gradients, and no learning.
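
You can verify this numerically; a quick sketch applying softmax to single-element vectors:

import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; the result is unchanged
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([3.7])))    # [1.]
print(softmax(np.array([-42.0])))  # [1.] -- always 1 for a single-element input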

To resolve this, first change your final-layer activation to either ReLU or linear, depending on the structure of your dataset (I haven't examined the provided data myself, but I'm sure you understand the structure of your own dataset).
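
As a sketch of the two options (whether your sales targets are strictly non-negative is an assumption about your data, not something checked against it):

nn.add(Dense(units=1, activation='linear'))  # safe default: output can be any real value
# or, if the targets are known to be non-negative:
nn.add(Dense(units=1, activation='relu'))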

I expect there may be further issues regarding the structure of your network, but I'll leave that up to you. For now, the big issue is your final-layer activation.

Upvotes: 1
