How to combine multiple images with one signal sequence in a dataset (Python/PyTorch/MultiModal)

Question

I want to build a multimodal model in which every signal sequence has several corresponding pictures.

Example: I have 10 images that correspond to 5 s of force data, and I want to combine them in one batch. That is, I want to build a model where those 10 images are "concatenated" with the force data (for example, an array with one force value per ms).

In other words, I have, for example, 10 pictures with dimensions 3 * 480 * 720 and one force recording (for example, an array of length 5000) that I want to process together in one batch.
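To make the intended data layout concrete, here is a minimal sketch of a dataset that pairs each group of 10 images with one force array. The class name `ImageForceDataset` is hypothetical, and the random tensors are only stand-ins for real images and force recordings; it assumes every sample has the same number of images and the same signal length, so the default `DataLoader` collation can stack them.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ImageForceDataset(Dataset):
    """Hypothetical dataset: each sample pairs N images with one force sequence."""

    def __init__(self, images, forces):
        # images: tensor of shape [num_samples, num_images, 3, H, W]
        # forces: tensor of shape [num_samples, signal_length]
        assert images.shape[0] == forces.shape[0]
        self.images = images
        self.forces = forces

    def __len__(self):
        return self.images.shape[0]

    def __getitem__(self, idx):
        # One sample = (10 images, 1 force array)
        return self.images[idx], self.forces[idx]


# Random stand-in data with the shapes from the example above
num_samples = 2
images = torch.rand(num_samples, 10, 3, 480, 720)
forces = torch.rand(num_samples, 5000)

dataset = ImageForceDataset(images, forces)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

image_batch, force_batch = next(iter(loader))
print(image_batch.shape)  # torch.Size([2, 10, 3, 480, 720])
print(force_batch.shape)  # torch.Size([2, 5000])
```

With this layout the batch dimension comes first and the 10 images of a sample sit in their own dimension, so a batch of images has shape [batch, 10, 3, 480, 720] and the matching force batch has shape [batch, 5000].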

My question is: how can I combine them in PyTorch so that I can create a multimodal model?

I tried to build a multimodal model, and I am hoping for a code example showing how this could work (combining/processing 10 pictures in one batch).
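One possible shape for such a model, as a rough sketch rather than a definitive design: encode every image with a small CNN, average the 10 image feature vectors per sample, encode the force signal with a 1D CNN, and concatenate both feature vectors before a final linear layer. The class name `MultiModalNet`, the layer sizes, the mean pooling over images, and `num_outputs=1` are all assumptions for illustration, since the target of the model is not specified here.

```python
import torch
import torch.nn as nn


class MultiModalNet(nn.Module):
    """Hypothetical late-fusion model: image features + force features."""

    def __init__(self, num_outputs=1, feat_dim=32):
        super().__init__()
        # Small 2D CNN applied to each image independently
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=4, padding=2),
            nn.ReLU(),
            nn.Conv2d(16, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> [B*N, feat_dim, 1, 1]
            nn.Flatten(),             # -> [B*N, feat_dim]
        )
        # Small 1D CNN applied to the force signal
        self.signal_encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4),
            nn.ReLU(),
            nn.Conv1d(16, feat_dim, kernel_size=9, stride=4, padding=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # -> [B, feat_dim, 1]
            nn.Flatten(),             # -> [B, feat_dim]
        )
        self.head = nn.Linear(2 * feat_dim, num_outputs)

    def forward(self, images, force):
        # images: [B, N, 3, H, W], force: [B, L]
        b, n, c, h, w = images.shape
        # Fold the image dimension into the batch so the CNN sees B*N images
        img_feat = self.image_encoder(images.reshape(b * n, c, h, w))
        # Unfold again and average over the N images of each sample
        img_feat = img_feat.reshape(b, n, -1).mean(dim=1)
        # Add a channel dimension for Conv1d: [B, 1, L]
        sig_feat = self.signal_encoder(force.unsqueeze(1))
        # Concatenate both modalities along the feature dimension
        fused = torch.cat([img_feat, sig_feat], dim=1)
        return self.head(fused)


model = MultiModalNet(num_outputs=1)
images = torch.rand(2, 10, 3, 480, 720)  # batch of 2 samples, 10 images each
force = torch.rand(2, 5000)              # batch of 2 force sequences
with torch.no_grad():
    out = model(images, force)
print(out.shape)  # torch.Size([2, 1])
```

Averaging over the 10 images discards their order; if the temporal order matters, the per-image features could instead be fed to a sequence model such as `nn.GRU` before fusion.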

