Reputation: 380
I am trying to shape a dataframe to be able to run an LSTM in R.
What I have is 100 lists with 4 features and 10 rows each, and I want to predict 100 values. I reshaped my lists into an array and tried to run the model, but got an error similar to this:
ValueError: Data cardinality is ambiguous:
x sizes: 10
y sizes: 100
I don't understand what shape I need to apply to my array to make it work.
I recreated my problem with sample data:
library("keras")
#creation of the dataframe
x <- data.frame(
x1 = sample(c(0,1), replace=TRUE, size=1000),
x2 = sample(c(0,1), replace=TRUE, size=1000),
x3 = sample(c(0,1), replace=TRUE, size=1000),
x4 = sample(c(0,1), replace=TRUE, size=1000)
)
y <- data.frame( y = sample(c(0,1), replace=TRUE, size=100))
#transform into list
x_list <- list()
for(i in 1:100) {
x_list[[i]] <- x[(10*(i-1)+1):(10*i), ]  # rows 1:10, 11:20, ..., 991:1000
}
#transform into array
arr_x <- array_reshape(as.numeric(unlist(x_list)),
dim = c(dim(x_list[[1]])[1],
dim(x_list[[1]])[2],
length(x_list) )
)
# check the dimensions used above
dim(x_list[[1]])[1]  # 10 rows
dim(x_list[[1]])[2]  # 4 features
length(x_list)       # 100 lists
lstm_model <- keras_model_sequential()
lstm_model %>%
layer_lstm(units = 64,
input_shape = c(10,4),
return_sequences = TRUE
)
lstm_model %>%
compile(optimizer = 'rmsprop', metrics = 'binary_crossentropy')
summary(lstm_model)
lstm_model %>% fit(
x = arr_x,
y = y,
batch_size = 1,
epochs = 20,
verbose = 0,
shuffle = FALSE
)
Upvotes: 1
Views: 514
Reputation: 380
After reading Jovan's comment I understood how to solve my problem. Basically, the shape of my array was defined as (10, 4, 100) by this:
arr_x <- array_reshape(as.numeric(unlist(x_list)),
dim = c(dim(x_list[[1]])[1],
dim(x_list[[1]])[2],
length(x_list) )
)
But the array for a multivariate LSTM has to have the shape (samples, steps, features).
In this case it should be c(100, 10, 4), as I have 100 samples, 10 steps and 4 features:
arr_x <- array_reshape(as.numeric(unlist(x_list)),
dim = c(length(x_list) ,
dim(x_list[[1]])[1],
dim(x_list[[1]])[2] )
)
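For completeness, here is a minimal sketch (not part of the original answer) that checks the new shape and fits end to end; it uses the default return_sequences = FALSE plus a dense layer so the model emits one value per sample, matching the 100 y values:
dim(arr_x)  # 100 10 4
model <- keras_model_sequential()
model %>%
  layer_lstm(units = 64, input_shape = c(10, 4)) %>%  # return_sequences defaults to FALSE
  layer_dense(units = 1, activation = "sigmoid")      # one prediction per sample
model %>% compile(optimizer = "rmsprop",
                  loss = "binary_crossentropy",
                  metrics = "accuracy")
model %>% fit(x = arr_x, y = as.matrix(y), epochs = 5, batch_size = 1)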
Upvotes: 0
Reputation: 815
Your LSTM model needs exactly the same number of X and y training samples, and likewise for the test sets if you plan to split the data into train and test. It makes no sense to train 1000 rows of X (the independent variables) against 100 rows of the response variable y; if you want to train against 100 responses, use 100 rows of independent variables as well.
1 --> So here I update your sample dataset so that x and y both have 1000 rows:
x <- data.frame(
x1 = sample(c(0,1), replace=TRUE, size=1000),
x2 = sample(c(0,1), replace=TRUE, size=1000),
x3 = sample(c(0,1), replace=TRUE, size=1000),
x4 = sample(c(0,1), replace=TRUE, size=1000)
)
y <- data.frame(y = sample(c(0,1), replace=TRUE, size=1000))
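A quick sanity check (not in the original answer) that the two now agree:
nrow(x); nrow(y)  # both should report 1000, so every X sample has a matching y value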
2 --> Then, moving on to building and compiling the LSTM model, I made some modifications: I added one more dense layer after the LSTM and set return_sequences to FALSE. Based on my experience, use return_sequences = TRUE when you are training a deep (stacked) LSTM model, and return_sequences = FALSE at the end of the LSTM stack to pass its output on to another layer (see the sketch after this code block). Adding the loss parameter to the compile step is mandatory; it defines the error term the model minimizes:
lstm_model <- keras_model_sequential()
lstm_model %>%
layer_lstm(units = 64, input_shape = c(4,1), return_sequences = FALSE) %>%
layer_dense(1, activation="relu")
lstm_model %>%
compile(optimizer = 'rmsprop',
metrics = 'binary_accuracy',
loss = "binary_crossentropy")
3 --> And for x and y to fit into the model, transform them to matrices as follows:
x_mat <- as.matrix(x)
y_mat <- as.matrix(y)
dim(x_mat) <- c(nrow(x_mat), ncol(x_mat), 1) # reshape X (the independent variables) into 3 dimensions
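# (aside, not in the original answer) sanity-check the reshaped array:
dim(x_mat)  # 1000 4 1 -> (samples, steps, features), matching input_shape = c(4, 1)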
lstm_model %>% fit(
x = x_mat,
y = y_mat,
batch_size = 2,
validation_split = 0.3, # hold out the last 30% (300 rows) for validation, so 700 rows are trained on instead of 1000
epochs = 10,
verbose = 1,
shuffle = FALSE
)
4 --> Then the model will start learning. If the accuracy is not satisfying, here are some tuning tips I have noted for myself:
A) The learning rate parameter of the optimizer: a smaller learning rate, e.g. lr=0.0001 compared to lr=0.01, makes the model learn more robust predictions (see the sketch after this list).
B) Switch to a layer type better suited to the distributions of the independent and response variables.
C) If the final fit seems to underfit, try a deeper model rather than a broader one.
D) Changing the activation function of the dense layer also helps in some circumstances.
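To make tip A concrete, here is a minimal sketch of setting the learning rate explicitly; it assumes a recent keras version where optimizer_rmsprop() takes a learning_rate argument (older versions named it lr):
lstm_model %>%
  compile(optimizer = optimizer_rmsprop(learning_rate = 0.0001),  # smaller than the 0.001 default
          metrics = 'binary_accuracy',
          loss = "binary_crossentropy")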
Upvotes: 1