Reputation: 569
I have CSV files in which every value is numeric except for the header row. When I try to build tensors from them, I get the following exception:
Traceback (most recent call last):
File "pytorch.py", line 14, in <module>
test_tensor = torch.tensor(test)
ValueError: could not determine the shape of object type 'DataFrame'
This is my code:
import torch
import dask.dataframe as dd
device = torch.device("cuda:0")
print("Loading CSV...")
test = dd.read_csv("test.csv", encoding = "UTF-8")
train = dd.read_csv("train.csv", encoding = "UTF-8")
print("Converting to Tensor...")
test_tensor = torch.tensor(test)
train_tensor = torch.tensor(train)
Using pandas instead of Dask for CSV parsing produced the same error. I also tried specifying dtype=torch.float64 inside the call to torch.tensor(data), but got the same error again.
Upvotes: 22
Views: 60274
Reputation: 1
The import functions all appear to require a .csv with an array of numbers. You mentioned in your original problem case that your .csv includes column headers. Please try your code without the headers in the .csv file.
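If editing the file is inconvenient, the header can also be skipped at load time. The following is a minimal sketch, not a definitive fix: it assumes a comma-delimited file with exactly one header line, and the in-memory `csv_text` is a hypothetical stand-in for the real train.csv.

```python
import io

import numpy as np
import torch

# Hypothetical in-memory CSV standing in for train.csv (one header line, numeric rows).
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# skiprows=1 drops the header line so only the numeric rows are parsed.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

# from_numpy keeps the NumPy dtype (float64 here) and shares memory with `data`.
tensor = torch.from_numpy(data)
print(tensor.shape, tensor.dtype)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path.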
Upvotes: 0
Reputation: 723
Using only NumPy:
import numpy as np
import torch
# skip_header=1 skips the header row; otherwise it is parsed as a row of NaN
tensor = torch.from_numpy(
    np.genfromtxt("train.csv", delimiter=",", skip_header=1)
)
Upvotes: 1
Reputation: 7693
Newer versions of pandas recommend using to_numpy() instead of values:
train_tensor = torch.tensor(train.to_numpy())
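As a slightly fuller sketch of the same idea, to_numpy() also accepts a dtype, which lets you pick the tensor's dtype at conversion time. The in-memory `csv_text` below is a hypothetical stand-in for train.csv, and float32 is an assumption chosen because it is PyTorch's default floating-point type.

```python
import io

import pandas as pd
import torch

# Hypothetical in-memory CSV standing in for train.csv.
csv_text = "a,b\n1,2\n3,4\n"

# pandas consumes the header row as column names, so only numbers remain in the frame.
train = pd.read_csv(io.StringIO(csv_text))

# to_numpy(dtype=...) returns a plain ndarray, which torch.tensor can copy into a tensor.
train_tensor = torch.tensor(train.to_numpy(dtype="float32"))
print(train_tensor.dtype, train_tensor.shape)
```

With Dask, as in the question, calling .compute() on the Dask DataFrame first yields a pandas DataFrame, to which the same conversion applies.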
Upvotes: 7
Reputation: 552
I think you're just missing .values
import torch
import pandas as pd
train = pd.read_csv('train.csv')
train_tensor = torch.tensor(train.values)
Upvotes: 14
Reputation: 389
Try converting it to an array first:
test_tensor = torch.Tensor(test.values)
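Note that torch.Tensor (capital T) and torch.tensor are not interchangeable here. The sketch below illustrates the difference, with a small hypothetical integer array standing in for test.values and assuming the default dtype has not been changed.

```python
import numpy as np
import torch

# Hypothetical integer data standing in for test.values.
values = np.array([[1, 2], [3, 4]], dtype=np.int64)

# torch.Tensor(...) always produces the default dtype (float32 unless changed).
as_default = torch.Tensor(values)

# torch.tensor(...) infers the dtype from the input, so int64 stays int64.
as_inferred = torch.tensor(values)

print(as_default.dtype, as_inferred.dtype)
```

If a floating-point tensor is what the model expects, passing dtype=torch.float32 to torch.tensor makes that choice explicit.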
Upvotes: 23