I want to build a multimodal model. For every signal sequence I have several pictures.
Example: I have 10 images that correspond to 5 s of force data, and I want to combine them in one batch. In other words, I want to build a model where those 10 images are "concatenated" with the force data (for example, an array with one force value per ms).
That means I have, for example, 10 pictures, each with dimensions 3 * 480 * 720, and one force signal (for example, an array of length 5000) that I want to process together in one batch.
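To make the shapes concrete, this is roughly what I imagine the inputs looking like as tensors. It is only a sketch: the variable names and the batch size of 2 are placeholders I made up, the zeros stand in for real data, and I am not sure this layout is the right one.

```python
import torch

batch_size = 2    # placeholder; any batch size
num_images = 10   # images per 5 s force sequence

# One sample = 10 RGB images of 480 x 720 plus one force signal of 5000 values (1 per ms).
images = torch.zeros(batch_size, num_images, 3, 480, 720)  # (B, N, C, H, W)
force = torch.zeros(batch_size, 5000)                      # (B, T)

print(images.shape)
print(force.shape)
```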
My question is: how can I combine them in PyTorch so that I can create a multimodal model?
I tried to build a multimodal model, and I am hoping for a code example showing how it could work (combining/processing 10 pictures in one batch).
Upvotes: 2
Views: 170