Reputation: 629
I've taken the prefabricated code that trains on the Iris CSV and tried to use it with my own CSV.
The error occurs here:
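import numpy as np
import tensorflow as tf

train_data = "train_data.csv"
test_data = "test_data.csv"
# Load the training CSV; the last column is read as the integer label.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=train_data,
    target_dtype=np.int,
    features_dtype=np.float32)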
with the error:
ValueError: invalid literal for int() with base 10: 'feature1'
The CSV looks like this:
feature1,feature2,feature3,label
1028.0,1012.0,1014.0,1
1029.0,1011.0,1017.0,-1
1027.0,1013.0,1015.0,1
...(and so on)
I understand the error is saying that 'feature1' cannot be parsed as an integer. However, when I run the same code on the Iris dataset, its header row also contains strings, and they are not used as tensors. The Iris CSV looks like this:
30,4,setosa,versicolor,virginica
5.9,3.0,4.2,1.5,1
6.9,3.1,5.4,2.1,2
5.1,3.3,1.7,0.5,0
Also (not sure whether this should be a separate question), I changed the header row to:
1,2,3,4
1028.0,1012.0,1014.0,1
1029.0,1011.0,1017.0,-1
1027.0,1013.0,1015.0,1
...(and so on)
and am now getting this error:
ValueError: could not broadcast input array from shape (3) into shape (2)
Any ideas or help are greatly appreciated! Thanks!!!
Upvotes: 0
Views: 532
Reputation: 487
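If you are going to use this function, you have to write the dataset in the format it expects. The first row is not a row of column names. It should look like:
n_samples,n_features,[class names]
The first two fields are parsed as integers, which is why 'feature1' raises the ValueError. The remaining fields in that row are not used as data.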
For example, the one for the iris dataset you are showing has the correct format:
30,4,setosa,versicolor,virginica
i.e. 30 samples and 4 features (the label in the last column is not counted as a feature).
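Both errors in the question follow from how that first row is consumed. Below is a minimal pure-Python sketch of the parsing logic as I understand it. It is an approximation for illustration, not the library's actual code, and the name `parse_with_count_header` is hypothetical. With the header `1,2,3,4`, it declares 1 sample with 2 features, so a row holding 3 feature values does not fit.

```python
import csv
import io


def parse_with_count_header(text):
    """Hypothetical stand-in that mimics the header handling described above."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    # The first two header fields must be integers; the rest are ignored here.
    n_samples = int(header[0])
    n_features = int(header[1])
    data, target = [], []
    for row in rows:
        label = int(row.pop(-1))            # last column is the label
        features = [float(v) for v in row]  # everything else is a feature
        if len(features) != n_features:
            raise ValueError(
                "row has %d features, header declares %d"
                % (len(features), n_features))
        data.append(features)
        target.append(label)
    if len(data) != n_samples:
        raise ValueError(
            "header declares %d samples, found %d" % (n_samples, len(data)))
    return data, target


good = (
    "3,3,labelname\n"
    "1028.0,1012.0,1014.0,1\n"
    "1029.0,1011.0,1017.0,-1\n"
    "1027.0,1013.0,1015.0,1\n"
)
data, target = parse_with_count_header(good)
print(len(data), target)
```

Passing text that starts with `feature1,feature2,feature3,label` fails at `int(header[0])`, which matches the first error in the question.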
If you have 50 samples in the dataset you created, each with 3 feature columns plus the label, it should be like:
50,3,labelname
1028.0,1012.0,1014.0,1
1029.0,1011.0,1017.0,-1
1027.0,1013.0,1015.0,1
...(and so on)
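If you would rather not count rows by hand, the header can be generated from the data. This is a rough sketch using only the standard library. The function name `rewrite_header`, the output filename, and the `label_name` default are all made up for illustration, and it assumes the label is the last column, so `n_features` is the column count minus one.

```python
import csv


def rewrite_header(src_path, dst_path, label_name="label"):
    """Replace a named header row with an 'n_samples,n_features,...' row."""
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    # Drop the original header row and any blank lines.
    body = [r for r in rows[1:] if r]
    n_samples = len(body)
    # Assumption: the last column is the label, so it is not a feature.
    n_features = len(body[0]) - 1 if body else 0
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow([n_samples, n_features, label_name])
        writer.writerows(body)
    return n_samples, n_features
```

For example, `rewrite_header("train_data.csv", "train_data_fixed.csv")` would write a copy whose first row carries the counts, and you would then point `filename` at the new file.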
Upvotes: 2