dergham

Reputation: 167

unknown problem(S) in a keras neural network for multi-label regression

I am new to neural networks and Keras, and I am trying to make sure that things work before moving on to my real data.

So here is a neural network with 1000 samples, three inputs, and three outputs.

X.csv contains the index repeated three times:

1,1,1

2,2,2

...up to 1000,1000,1000

Y.csv contains three labels (the index, the index*5, the index/5):

1,5,0.2

2,10,0.4

...up to 1000,5000,200
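For reference, the two CSV files described above can be generated with a short NumPy script like this (the output filenames are placeholders, not the paths from the question):

```python
import numpy as np

# 1000-row inputs: each row repeats its 1-based index three times.
idx = np.arange(1, 1001)
X = np.column_stack([idx, idx, idx])

# Three targets per row: the index, the index*5, the index/5.
Y = np.column_stack([idx, idx * 5, idx / 5])

np.savetxt('X.csv', X, delimiter=',', fmt='%g')
np.savetxt('Y.csv', Y, delimiter=',', fmt='%g')
```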

import random
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.layers import Input, Dense
from keras.models import Model

random.seed(42)
X = np.genfromtxt(r'C:\Users\boss\Desktop\X.csv', delimiter=',')
y = np.genfromtxt(r'C:\Users\boss\Desktop\Y.csv', delimiter=',')
y1, y2, y3 = y[:, 0:1], y[:, 1:2], y[:, 2:]
X_train, X_test, y1_train, y1_test, y2_train, y2_test, y3_train, y3_test = train_test_split(X, y1, y2, y3, test_size=0.3, random_state=0)

# Scale the inputs; fit the scaler on the training set only.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

inp = Input((3,)) 
x = Dense(10, activation='relu')(inp)
x = Dense(10, activation='relu')(x)
x = Dense(10, activation='relu')(x)
out1 = Dense(1,  activation='linear')(x)
out2 = Dense(1,  activation='linear')(x)
out3 = Dense(1,  activation='linear')(x)

model = Model(inputs=inp, outputs=[out1,out2,out3])
model.compile(optimizer = "adam", loss = 'mse')
model.fit(x=X_train, y=[y1_train,y2_train,y3_train], batch_size=100, epochs=10, verbose=1, validation_split=0.3,  shuffle=True)            

#plot predicted data vs real data
y_pred = model.predict(X_test)
plt.plot(y1_test, color = 'red', label = 'Real data')
plt.plot(y_pred[0], color = 'blue', label = 'Predicted data')
plt.title('y1')
plt.legend()
plt.show()

plt.plot(y2_test, color = 'red', label = 'Real data')
plt.plot(y_pred[1], color = 'blue', label = 'Predicted data')
plt.title('y2')
plt.legend()
plt.show()

plt.plot(y3_test, color = 'red', label = 'Real data')
plt.plot(y_pred[2], color = 'blue', label = 'Predicted data')
plt.title('y3')
plt.legend()
plt.show()

Unfortunately, both the loss and the validation loss are huge (millions). Another problem is that the results differ each time despite using a random seed.

Upvotes: 1

Views: 96

Answers (1)

lehiester

Reputation: 900

One likely cause of the high loss is the small number of epochs; you will rarely get good results with only 10. Try 100, 1000, etc., to see how the results improve.
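The effect of epoch count can be seen even in a toy one-variable gradient-descent fit of y = 5x (everything here is illustrative, not the asker's network):

```python
import numpy as np

# Toy 1-D linear regression on y = 5x, trained by plain gradient descent.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 5 * x

def final_mse(epochs, lr=0.5):
    w = 0.0
    for _ in range(epochs):
        grad = 2 * np.mean((w * x - y) * x)  # derivative of MSE w.r.t. w
        w -= lr * grad
    return np.mean((w * x - y) ** 2)

loss_10 = final_mse(10)      # stops well short of the optimum
loss_1000 = final_mse(1000)  # essentially converged
```

The same model trained for 1000 epochs ends with a far smaller loss than the 10-epoch run, which is the point of the advice above.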

For reproducible random number generation, you also need to specify seeds for NumPy and TensorFlow (if you're using the TensorFlow backend, which is the default). Here's an example given by this article:

from numpy.random import seed
seed(1)  # seed NumPy's global RNG
from tensorflow import set_random_seed
set_random_seed(2)  # TF 1.x API; in TF 2.x this is tf.random.set_seed

Upvotes: 1
