Reputation: 49
I'm going through the examples in Fundamentals of Machine Learning by Thomas P. Trappenberg. The book doesn't provide the promised Jupyter notebook files, so I'm learning by copying and reading the example code.
Here's an introductory example on analyzing sequential data found in Chapter 9. We want a simple network with a single hidden layer to predict the sine wave correctly.
# sine sequence
import numpy as np
import matplotlib.pyplot as plt
from keras import models, layers, optimizers, datasets, utils, losses
# sine data with 10 steps/cycle
seq = np.array([np.sin(2*np.pi*i/10) for i in range(10)])
print(seq)
num_seq = 200
x_train = np.array([])
y_train = np.array([])
for i in range(num_seq):
    ran = np.random.randint(10)
    x_train = np.append(x_train, seq[ran])
    y_train = np.append(y_train, seq[np.mod(ran+1, 10)])
x_test = np.array(seq)
y_test = np.array(np.roll(seq, -1))
So far, I see that we are picking 200 random samples from the 10 points of one sine cycle. x_train contains 200 values of the sine function, and y_train contains, for each of them, the next value in the sequence that we wish to predict.
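(A quick shape check, a minimal sketch repeating the construction above, shows that each sample in x_train is a single scalar rather than a pair of values, which is relevant to the error below:)

```python
import numpy as np

# Same construction as above: 200 random samples from a 10-step sine cycle.
seq = np.array([np.sin(2*np.pi*i/10) for i in range(10)])
num_seq = 200
x_train = np.array([])
y_train = np.array([])
for i in range(num_seq):
    ran = np.random.randint(10)
    x_train = np.append(x_train, seq[ran])
    y_train = np.append(y_train, seq[np.mod(ran+1, 10)])

print(x_train.shape)  # (200,) -- each sample is one scalar, not a pair
print(y_train.shape)  # (200,)
```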
The following code is supposed to predict the sine function using the knowledge of the previous two points in the sequence. This is where I find an error when I run the code.
# MLP2
inputs = layers.Input(shape = (2, ))
h = layers.Dense(2, activation = 'relu')(inputs)
outputs = layers.Dense(1, activation = 'tanh')(h)
model = models.Model(inputs, outputs)
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
print(model.summary())
model.fit(x_train, y_train, epochs = 1000, batch_size = 100, verbose = 0)
# evaluate
y_pred = model.predict(x_test, batch_size = 10, verbose = 1)
plt.plot(y_test, 'x')
plt.plot(y_pred, 'o')
When I run this code, from the line
model.fit(x_train, y_train, epochs = 1000, batch_size = 100, verbose = 0)
I get that
ValueError: Error when checking input: expected input_11 to have shape (2,) but got array with shape (1,)
I partly understand it: each sample in x_train is a single scalar, so Keras sees inputs of shape (1,) rather than the expected (2,). When I was copying the line
inputs = layers.Input(shape = (2, ))
I blindly thought 'maybe this is how Keras is written to understand the past two entries of the sequence' because this is my first time studying machine learning and I'm not familiar with Keras.
Do you see a mistake in the example code, or is it perhaps a mistake on my part?
(I have a direct follow-up question because the next example introduces RNN, and it starts with a code
# RNN
x_train=np.reshape(x_train, (200, 2, 1) )
x_test=np.reshape(x_test, (10, 2, 1) )
which doesn't work: the current x_train has only 200 entries, so it cannot be reshaped to (200, 2, 1). I think if someone can answer my original question, this follow-up question may resolve itself.)
Thank you for your time.
Upvotes: 1
Views: 137
Reputation: 1242
Your for loop seems to be the source of the error. You need to ensure that x_train has shape (num_samples, num_timesteps), which in your case is (200, 2). However, inside the for loop you are appending only a single value to x_train, and along the outermost (num_samples) dimension at that.
Your x_train must be an array of arrays, where each inner array contains two elements: the first is randomly chosen, and the second is the one following it in seq. The reason is that, given the two past values seq[ran] and seq[ran + 1], you want to predict the next value.
Since you know the shape of the array in advance, a more efficient way would be to initialize the array to empty or zeros, rather than append on every iteration.
# Preallocating for efficiency.
x_train = np.zeros((num_seq, 2))
y_train = np.zeros(num_seq)
for i in range(num_seq):
    # Maximum index in seq is 9: the last length-2 window takes
    # indices 7, 8 for x_train and index 9 for y_train.
    ran = np.random.randint(8)
    x_train[i][0] = seq[ran]
    x_train[i][1] = seq[ran + 1]
    y_train[i] = seq[ran + 2]
Now your x_train array has shape (200, 2), as required.
Regarding your follow-up question, RNNs in Keras take input of the shape (num_samples, num_timesteps, num_features). If you have a single feature in each timestep, num_features is 1. Hence the shape (200, 2, 1).
You can reshape x_train to (200, 2, 1) using x_train = np.expand_dims(x_train, axis=2).
Refer: np.expand_dims
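Note that x_test needs the same treatment before model.predict (and before the RNN reshape). A minimal sketch, assuming you want all ten length-2 windows of the cycle, using np.roll for the wraparound just as the original y_test did:

```python
import numpy as np

seq = np.array([np.sin(2*np.pi*i/10) for i in range(10)])

# Pair each value with its successor (wrapping around the cycle),
# mirroring the (num_samples, num_timesteps) layout of x_train.
x_test = np.stack([seq, np.roll(seq, -1)], axis=1)  # shape (10, 2)
y_test = np.roll(seq, -2)                           # shape (10,)

# For the RNN example, add the trailing feature dimension.
x_test_rnn = np.expand_dims(x_test, axis=2)         # shape (10, 2, 1)
```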
Upvotes: 1
Reputation: 247
You just need to change the input shape:
inputs = layers.Input(shape = (1, ))
Then try running the code again. Hope this helps!
Upvotes: 1