Nhqazi

Reputation: 768

How to input a sequence of images into an LSTM network for video classification

I am using an LSTM to classify videos. I am using the Keras Python library to create a Long Short-Term Memory (LSTM) network. I understand that the LSTM takes input shaped as (samples, timesteps, features). I have three classes of video, and each class has 10 video files, so I have 10*3 = 30 samples. For each video file I created a sequence of 32 frames. I use a pre-trained model to extract features: I feed each frame into a pre-trained VGG16 model, which generates 512 features for a single frame, so one video file should give an array of shape (32, 512). I then append each of these arrays into a single array for all 30 samples and save it as a NumPy array; its final shape is (960, 512).

Now my problem is how to reshape this array into (samples, timesteps, features) = (30, 32, 512). This is the snippet of code I used. Please note that x_generator has shape (960, 512) and I wish to convert it to (30, 32, 512). I would appreciate help solving my problem.

 import numpy as np
 from keras.preprocessing import image
 from keras.applications.vgg16 import preprocess_input

 x_generator = None
 # Run once per frame `img` of each video:
 if x_generator is None:
    imgx = image.img_to_array(img)
    imgx = np.expand_dims(imgx, axis=0)
    imgx = preprocess_input(imgx)
    x_generator = base_model.predict(imgx)
 else:
    imgx = image.img_to_array(img)
    imgx = np.expand_dims(imgx, axis=0)
    imgx = preprocess_input(imgx)
    x_generator = np.append(x_generator, base_model.predict(imgx), axis=0)
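
For reference, here is a self-contained sketch of the same extraction loop using random dummy frames, so the shapes can be checked end to end. The pooling='avg' VGG16 setup and the dummy frames are illustrative assumptions only, not part of my actual code:

 import numpy as np
 from keras.applications.vgg16 import VGG16, preprocess_input

 # Assumption: VGG16 without the classifier head, with global average
 # pooling, so each frame yields one 512-dimensional feature vector.
 base_model = VGG16(weights='imagenet', include_top=False, pooling='avg')

 x_generator = None
 for video in range(30):                  # 3 classes x 10 video files
     for frame in range(32):              # 32 frames per video
         imgx = np.random.rand(1, 224, 224, 3) * 255.0   # dummy frame
         imgx = preprocess_input(imgx)
         feat = base_model.predict(imgx)  # shape (1, 512)
         if x_generator is None:
             x_generator = feat
         else:
             x_generator = np.append(x_generator, feat, axis=0)

 print(x_generator.shape)                 # (960, 512)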

Upvotes: 0

Views: 976

Answers (1)

remeus

Reputation: 2424

If you got the 960 rows by appending the 30 samples of dimension (32, 512) one after another, you can just use np.reshape to give the array the expected dimensions; np.reshape keeps the rows in order, so each block of 32 consecutive rows stays grouped as one video.

x_generator = np.reshape(x_generator, [30, 32, 512])
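
Once reshaped, the array can be fed straight to a Keras LSTM. Below is a minimal sketch, assuming the labels y are one-hot encoded with shape (30, 3); the layer sizes and training settings are illustrative, not prescriptive:

 from keras.models import Sequential
 from keras.layers import LSTM, Dense

 model = Sequential()
 model.add(LSTM(64, input_shape=(32, 512)))   # (timesteps, features)
 model.add(Dense(3, activation='softmax'))    # 3 video classes
 model.compile(optimizer='adam', loss='categorical_crossentropy',
               metrics=['accuracy'])
 model.fit(x_generator, y, epochs=10, batch_size=4)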

Upvotes: 1
