Reputation: 200
I have 4 files: train.txt, trainLabel.txt, test.txt, testLabel.txt
train.txt
1,60,feature_col0,feature_col1,feature_col2,feature_col3,feature_col4,feature_col5,feature_col6,feature_col7,feature_col8,feature_col9,feature_col10,feature_col11,feature_col12,feature_col13,feature_col14,feature_col15,feature_col16,feature_col17,feature_col18,feature_col19,feature_col20,feature_col21,feature_col22,feature_col23,feature_col24,feature_col25,feature_col26,feature_col27,feature_col28,feature_col29,feature_col30,feature_col31,feature_col32,feature_col33,feature_col34,feature_col35,feature_col36,feature_col37,feature_col38,feature_col39,feature_col40,feature_col41,feature_col42,feature_col43,feature_col44,feature_col45,feature_col46,feature_col47,feature_col48,feature_col49,feature_col50,feature_col51,feature_col52,feature_col53,feature_col54,feature_col55,feature_col56,feature_col57,feature_col58,feature_col59
1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,1,0,0,0,1,0,0,1,0,0,1,0,0,1
trainLabel.txt
1,4,feature_col0,feature_col1,feature_col2,feature_col3
1,1,1,0
test.txt
1,60,feature_col0,feature_col1,feature_col2,feature_col3,feature_col4,feature_col5,feature_col6,feature_col7,feature_col8,feature_col9,feature_col10,feature_col11,feature_col12,feature_col13,feature_col14,feature_col15,feature_col16,feature_col17,feature_col18,feature_col19,feature_col20,feature_col21,feature_col22,feature_col23,feature_col24,feature_col25,feature_col26,feature_col27,feature_col28,feature_col29,feature_col30,feature_col31,feature_col32,feature_col33,feature_col34,feature_col35,feature_col36,feature_col37,feature_col38,feature_col39,feature_col40,feature_col41,feature_col42,feature_col43,feature_col44,feature_col45,feature_col46,feature_col47,feature_col48,feature_col49,feature_col50,feature_col51,feature_col52,feature_col53,feature_col54,feature_col55,feature_col56,feature_col57,feature_col58,feature_col59
0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,1,0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1
testLabel.txt
1,4,feature_col0,feature_col1,feature_col2,feature_col3
1,1,0,0
dpNum means feature_col
I want to input data like a row of train.txt:
[1 ,0..........., 1] # a rank 1 tensor; this is a vector with shape [60]
and predict a label vector like:
[1,0,0,1] # a rank 1 tensor; this is a vector with shape [4]
Upvotes: 0
Views: 902
Reputation: 2860
From the tutorials page:
# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=2000)
I.e. you can access the targets via training_set.target, which should give you the label for each data point.
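For data stored like the files in the question, an object with .data and .target attributes could be built by hand. The following is a minimal sketch in plain Python, assuming the first line of each file is a header of the form "n_rows,n_columns,column names..." and every later line is one comma-separated row of integers. Dataset, load_rows and load_dataset are hypothetical helper names, not part of TensorFlow, and the file contents are shortened, made-up stand-ins for train.txt and trainLabel.txt.

```python
import csv
import io
from collections import namedtuple

# Hypothetical container mirroring the .data / .target attributes used above.
Dataset = namedtuple("Dataset", ["data", "target"])


def load_rows(fileobj):
    """Read a CSV whose first line is assumed to be 'n_rows,n_cols,<names...>'."""
    reader = csv.reader(fileobj)
    header = next(reader)
    n_rows, n_cols = int(header[0]), int(header[1])
    rows = [[int(v) for v in row] for row in reader if row]
    # Sanity-check the data against the counts announced in the header.
    if len(rows) != n_rows or any(len(r) != n_cols for r in rows):
        raise ValueError("file contents do not match the header counts")
    return rows


def load_dataset(feature_file, label_file):
    """Pair each feature row with the label row at the same position."""
    data = load_rows(feature_file)
    target = load_rows(label_file)
    if len(data) != len(target):
        raise ValueError("feature and label files have different row counts")
    return Dataset(data=data, target=target)


# Shortened, made-up stand-ins for train.txt and trainLabel.txt.
train_txt = io.StringIO(
    "2,3,feature_col0,feature_col1,feature_col2\n1,0,0\n0,1,1\n"
)
train_label_txt = io.StringIO("2,2,feature_col0,feature_col1\n1,0\n0,1\n")

training_set = load_dataset(train_txt, train_label_txt)
print(training_set.data)    # feature rows
print(training_set.target)  # label rows, one per feature row
```

With real files, the io.StringIO objects would be replaced by open("train.txt") and open("trainLabel.txt"). Whether a given classifier accepts a 4-element label vector per row depends on the model and is not covered by this sketch.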
Also, I am not sure if you got confused with some terminology: You say that the training dataset has 15'000 data points, but only 1'000 labels, which (at least for the Iris dataset) does not make much sense as I believe that the whole dataset is labeled. Did you mean to say that you have 15'000 training samples and 1'000 test samples?
So, not sure if all of the following is already clear to you, but if not, hopefully it clears things up for you. Say the Iris dataset looks something like this (taken from Wikipedia):
Sepal length  Sepal width  Petal length  Petal width  Species
5.1           3.5          1.4           0.2          I. setosa
4.9           3.0          1.4           0.2          I. setosa
4.7           3.2          1.3           0.2          I. setosa
....
5.1           2.5          3.0           1.1          I. versicolor
5.7           2.8          4.1           1.3          I. versicolor
Now usually the following terminology is used: each row is a data point (or sample), the four measurement columns are its features, and the species column is its label (here I. setosa or I. versicolor). Usually, labels are encoded somehow, e.g. the label is 0 for I. setosa and 1 otherwise, as you hint at in your question. There could be more than just those two possible labels, though; e.g. the Iris dataset usually also includes a third flower called I. virginica.
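The label encoding described here can be sketched as a plain lookup table. This is only an illustration in plain Python: the integer assigned to each species is an arbitrary choice, and LABEL_TO_ID and encode_labels are hypothetical names, not taken from any library.

```python
# Arbitrary integer code per species; any consistent assignment works.
LABEL_TO_ID = {"I. setosa": 0, "I. versicolor": 1, "I. virginica": 2}


def encode_labels(species):
    """Map species names to integer class labels."""
    return [LABEL_TO_ID[name] for name in species]


# Species column of the example Iris rows listed earlier in this answer.
species = ["I. setosa", "I. setosa", "I. setosa", "I. versicolor", "I. versicolor"]
encoded = encode_labels(species)
print(encoded)  # [0, 0, 0, 1, 1]
```

An unknown species name raises a KeyError, which is usually preferable to silently assigning a wrong class.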
Upvotes: 1