How to feed multi-array data to a model?

Question

As you can see below, the object n3w_coin has a method called forecast_coin() that returns a data frame with 5 columns once date_time is removed. I split the data with train_test_split and then normalize it with sc. After that I convert the 2D array to a 3D array, which I would like to pass to the model for training, but I am having trouble figuring out how to feed normalized_x_train to the model.

My goal is to feed every sub-array inside normalized_x_train to the model.

I get the following error: IndexError: tuple index out of range

Please explain why this happens and what is wrong with my approach.

df = pd.DataFrame(n3w_coin.forecast_coin())

x_sth = np.array(df.drop(['date_time'],1))
y_sth = np.array(df.drop(['date_time'],1))



sc = MinMaxScaler(feature_range=(0,1))


X_train, X_test, y_train, y_test = train_test_split(x_sth,y_sth, test_size=0.2, shuffle=False)

print (X_train)
normalized_x_train = sc.fit_transform(X_train)
normalized_y_train = sc.fit_transform(y_train)

print (normalized_x_train)

### converting to a 3D array to feed the model 

normalized_x_train = np.reshape(normalized_x_train, (400 , 5 ,1 ))

print (normalized_x_train.shape)

print (normalized_x_train)

model = Sequential()
model.add(LSTM(units = 100, return_sequences = True, input_shape=(normalized_x_train.shape[5],1)))
 
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.fit(normalized_x_train, normalized_y_train, epochs=100, batch_size=400 )

user11530462 · Accepted Answer

A couple of observations about your code:

In addition to normalizing the data, you need to prepare the time series data. The function below pre-processes the data so that it can be fed to an LSTM model.

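import numpy as np

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
    # Build (window, label) pairs from a 2D array of shape (rows, features).
    data = []
    labels = []

    # The first usable position needs history_size rows before it.
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size

    for i in range(start_index, end_index):
        # Rows i-history_size .. i-1, sampled every `step` rows.
        indices = range(i - history_size, i, step)
        data.append(dataset[indices])

        if single_step:
            # One label, target_size rows after the end of the window.
            labels.append(target[i + target_size])
        else:
            # A sequence of the next target_size labels.
            labels.append(target[i:i + target_size])

    # data has shape (samples, history_size // step, features).
    return np.array(data), np.array(labels)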

The parameters history_size and target_size are important. history_size is the number of past rows of the time series that make up each input window. target_size controls what is predicted: with single_step=True the label is the single value target_size rows after the end of the window, otherwise the label is the sequence of the next target_size values.
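As a rough illustration of what the function produces, here is a minimal sketch on synthetic data. The array size (400 rows, 5 features) mirrors the question, while the window length of 10, the choice of the first column as the target, and target_size=0 (predict the row right after the window) are assumptions made only for this example. The function is repeated so the snippet runs on its own.

```python
import numpy as np

def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
    data, labels = [], []
    start_index = start_index + history_size
    if end_index is None:
        end_index = len(dataset) - target_size
    for i in range(start_index, end_index):
        indices = range(i - history_size, i, step)
        data.append(dataset[indices])
        if single_step:
            labels.append(target[i + target_size])
        else:
            labels.append(target[i:i + target_size])
    return np.array(data), np.array(labels)

# Synthetic stand-in for the scaled data frame: 400 rows, 5 features.
rng = np.random.default_rng(0)
dataset = rng.random((400, 5))
target = dataset[:, 0]  # assumption: the first column is the value to predict

x, y = multivariate_data(dataset, target, 0, None,
                         history_size=10, target_size=0, step=1,
                         single_step=True)

print(x.shape)  # (390, 10, 5): samples, timesteps, features
print(y.shape)  # (390,): one label per window
```

Each sample is a window of 10 consecutive rows with all 5 features, which is the (samples, timesteps, features) layout an LSTM layer expects.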

Your network has only one LSTM layer, and you set return_sequences = True on it. That argument should be True only if another LSTM layer follows this one.
Since you want to predict a numeric value, the network should end with a Dense layer that has 1 unit and activation = 'linear'.
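The two points above could look roughly like the following sketch. It assumes TensorFlow 2.x with tf.keras, windows of 10 timesteps with 5 features, and random arrays in place of the real data. The 'mae' metric is a substitution made here, since 'accuracy' is a classification metric and is not informative for a regression target.

```python
import numpy as np
import tensorflow as tf

timesteps, features = 10, 5  # assumed window length and feature count

model = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, features)),
    # Single LSTM layer: return_sequences is left at its default (False),
    # so only the last hidden state is passed on.
    tf.keras.layers.LSTM(100),
    # One linear unit for a single numeric prediction.
    tf.keras.layers.Dense(1, activation="linear"),
])
model.compile(optimizer="adam", loss="mean_squared_error", metrics=["mae"])

# Random placeholder data with the (samples, timesteps, features) layout.
x = np.random.rand(32, timesteps, features).astype("float32")
y = np.random.rand(32, 1).astype("float32")

model.fit(x, y, epochs=1, batch_size=16, verbose=0)
preds = model.predict(x, verbose=0)
print(preds.shape)  # one prediction per sample
```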

Please refer to this TensorFlow tutorial on time series analysis, which contains complete code for your problem.
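As for the IndexError itself: after the reshape, normalized_x_train.shape is a tuple with three entries, so shape[5] asks for an index that does not exist. The sketch below reproduces this on a random array of the same size as in the question (the data is a stand-in, not the real data frame) and shows how the input_shape could be taken from the array instead.

```python
import numpy as np

# Synthetic stand-in for the scaled training data: 400 rows, 5 features.
normalized_x_train = np.random.rand(400, 5)
normalized_x_train = np.reshape(normalized_x_train, (400, 5, 1))

shape = normalized_x_train.shape
print(shape)  # (400, 5, 1): only indices 0, 1 and 2 exist

try:
    shape[5]
except IndexError as err:
    print("IndexError:", err)  # IndexError: tuple index out of range

# The LSTM input_shape leaves out the sample axis: (timesteps, features).
input_shape = (shape[1], shape[2])
print(input_shape)  # (5, 1)
```

Note that a reshape to (400, 5, 1) treats the 5 columns of each row as 5 timesteps with 1 feature each, which is probably not what is intended for a time series; the windowing function above builds the timestep axis from consecutive rows instead.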
