Reputation: 1252
Goal: I am working with RNNs in PyTorch, and my data is given as a list of DataFrames, where each DataFrame represents one observation, like this:
import numpy as np
import pandas as pd
data = [pd.DataFrame(np.zeros((5,50))) for x in range(100)]
which means 100 observations, each with 5 timesteps and 50 parameters. For my model I need a tensor of shape (100, 5, 50).
Issue: I tried a lot of things but nothing seems to work. Does anyone know how this is done? This approach doesn't work:
import torch
torch.tensor(np.array(data))
I think the problem is converting the DataFrames into arrays and the list into a tensor at the same time.
Upvotes: 1
Views: 6727
Reputation: 84
It's too late, but if someone else is still asking this question somewhere... this is for you <3
import torch
import numpy as np
import pandas as pd
from typing import List
list_of_dataframe : List[pd.DataFrame] #= ....
my_tensor = torch.tensor(np.array(list_of_dataframe))
(python 3.9, numpy 1.20, pytorch 1.10)
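Since the question reports that the one-step conversion failed while this answer reports that it works, a defensive version may be useful. This is only a sketch: the random stand-in data, the `expected_shape` name, and the fallback branch are my own assumptions, not part of the answer above. It tries `np.array` on the whole list first and falls back to converting each DataFrame with `to_numpy()` if the result does not look like a numeric (100, 5, 50) array.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-in data: 100 observations, 5 timesteps, 50 parameters.
list_of_dataframe = [pd.DataFrame(np.random.rand(5, 50)) for _ in range(100)]

# One-step conversion, as in the answer above.
array = np.array(list_of_dataframe)

# Assumed guard: if the result is not a numeric array of the expected shape,
# convert each DataFrame explicitly and stack along a new first axis.
expected_shape = (len(list_of_dataframe), 5, 50)
if array.shape != expected_shape or array.dtype == object:
    array = np.stack([df.to_numpy() for df in list_of_dataframe])

my_tensor = torch.tensor(array)
print(my_tensor.shape)
```

If the guard never triggers on your versions, the one-liner is enough on its own.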
Upvotes: 0
Reputation: 4719
I don't think you can convert the list of DataFrames in a single command, but you can convert it into a list of arrays (or a list of tensors) and then stack that list along a new first dimension.
E.g.
import pandas as pd
import numpy as np
import torch
data = [pd.DataFrame(np.zeros((5,50))) for x in range(100)]
list_of_arrays = [np.array(df) for df in data]
torch.tensor(np.stack(list_of_arrays))
# or, stacking tensors instead of arrays:
list_of_tensors = [torch.tensor(np.array(df)) for df in data]
torch.stack(list_of_tensors)
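To connect this back to the RNN use case in the question, here is a rough sketch of feeding the stacked tensor into a recurrent layer. Two things are assumptions on my part rather than anything stated in the thread: the cast to `float32` (the zeros DataFrames produce `float64`, while PyTorch layers use `float32` parameters by default) and the arbitrary `hidden_size=16`. `batch_first=True` is set because the (100, 5, 50) layout puts the observations in the first dimension.

```python
import numpy as np
import pandas as pd
import torch
from torch import nn

data = [pd.DataFrame(np.zeros((5, 50))) for _ in range(100)]

# Stack to (100, 5, 50) and cast to float32 to match the layer's default dtype.
batch = torch.tensor(
    np.stack([df.to_numpy() for df in data]), dtype=torch.float32
)

# hidden_size=16 is an arbitrary choice for illustration.
# batch_first=True tells the layer the input is (batch, timesteps, features).
rnn = nn.RNN(input_size=50, hidden_size=16, batch_first=True)
output, hidden = rnn(batch)

print(batch.shape)   # torch.Size([100, 5, 50])
print(output.shape)  # torch.Size([100, 5, 16])
print(hidden.shape)  # torch.Size([1, 100, 16])
```

Without `batch_first=True`, the layer would read the first dimension as timesteps, so the tensor would need to be permuted to (5, 100, 50) instead.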
Upvotes: 1