Reputation: 13
I am trying to train a Neural Network(NN) implemented through Keras to implement the following function.
y(n) = y(n-1)*0.9 + x(n)*0.1
So the idea is to have a signal as train_x data and pass through the above function to get a train_y data, giving us a (train_x, train_y) training data.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt
train_x = np.concatenate((np.ones(100)*120,np.ones(150)*150,np.ones(150)*90,np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]
alpha = 0.9
for i in range(train_x.size):
train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)
train_x data vs train_y data plot
The function under question y(n) is a low pass function and makes the x(n) value to not change abruptly, as shown in the plot.
Then I make a NN and fit it with (train_x, train_y) and plot the
model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=1, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(train_x, train_y, epochs=200, verbose=0)
print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()
And the final loss value is approximately 2.9, which I thought was pretty good. But then the accuracy plot was like this
So when I check the prediction of the neural network over the data it was trained on
plt.plot(model.predict(train_x))
plt.plot(train_x)
plt.show()
The values have just offsetted by a little and that's all. I tried changing the activation functions, number of neurons and layers but the result still is the same. What am I doing wrong?
---- Edit ----
Made the NN to accept 2 dimensional input and it works as intended
import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt
train_x = np.concatenate((np.ones(100)*120,np.ones(150)*150,np.ones(150)*90,np.ones(100)*110), axis=None)
train_y = np.ones(train_x.size)*train_x[0]
alpha = 0.9
for i in range(train_x.size):
train_y[i] = train_y[i-1]*alpha + train_x[i]*(1 - alpha)
train = np.empty((500,2))
for i in range(500):
train[i][0]=train_x[i]
train[i][1]=train_y[i]
model = Sequential()
model.add(Dense(128, kernel_initializer='normal', input_dim=2, activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(256, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation='linear'))
model.compile(loss='mean_absolute_error',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(train, train_y, epochs=100, verbose=0)
print(history.history['loss'][-1])
plt.plot(history.history['loss'])
plt.show()
Upvotes: 1
Views: 99
Reputation: 44043
The neural network you're building is a simple multi-layer perceptron with one input node and one output node. This means that it is essentially a function that accepts one real number and returns one real number -- context is not passed in and can therefore not be considered. The expression
model.predict(train_x)
Does not evaluate a vector-to-vector function for the vector train_x
but evaluates a number-to-number function for every number in train_x
, then returns the list of results. This is why you get flat segments in the train_x_vs_predict_x plot: the same input numbers produce the same output numbers every time.
Given this constraint, the approximation is actually quite good. For example, the network has seen for x values of 150 many y values of 150 and a few lower ones but never anything above 150. So, given an x of 150, it predicts a y value of slightly lower than 150.
The function you wanted, on the other hand, refers to the previous function value and will need information about this in its input. If what you're trying to build is a function that accepts a sequence of real numbers and returns a sequence of real numbers, you could do that with a many-to-many recurrent network (and you're going to need a lot more training data than one example sequence), but since you can calculate the function directly, why bother with neural networks at all? There's no need to whip out the chainsaw where a butter knife will do.
Upvotes: 1
Reputation: 4521
If I execute your code, I get the following plot for the X-Y values:
If I didn't miss something important here and you really feed that to your neural net, you probably can't expect better results. The reason is, that a neural net is just a function that can only calculate one output vector for one input. In your case the output vector would consist of only one element (your y value), but as you can see in the diagram above, for x=90 there is not just one single output. So what you feed to your neural net, cannot really be calculated as a function and so most likely the network tries to calculate the straight line between point ~(90, 145) and ~(150, 150). I mean, the "upper line" in the diagram.
Upvotes: 2