Reputation: 1236
I am new to TensorFlow and need to load a dataset to train my model. A sample of my dataset looks like this:
TRAINING_FILE.iloc[0:5,0:5]
   num_var_1  num_var_2  num_var_3     num_var_4  num_var_5
0  -0.010655   0.040182        0.0  1.800000e-07  -0.011319
1  -0.006542   0.157872        0.0  2.105000e-06  -0.010966
2  -0.010626   0.089140        0.0  3.550000e-07  -0.011286
3  -0.010626   0.227239        0.0  1.050000e-06  -0.011159
4  -0.008947   0.160410        0.0  2.105000e-06  -0.010966
I load this CSV file with the code from the TensorFlow documentation. This is how I load my training file:
train_fn = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename = TRAINING_FILE,
    target_dtype = np.int,
    features_dtype = np.float32)
When I run the script, I get the following error:
Traceback (most recent call last):
  File "train.py", line 31, in <module>
    features_dtype = np.float32)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 48, in load_csv_with_header
    n_samples = int(header[0])
ValueError: invalid literal for int() with base 10: '-0.0106550312'
Upvotes: 2
Views: 745
Reputation: 5808
Those all look like floats, but load_csv_with_header is looking for a label column with dtype target_dtype (an integer in your case). You can select this column with the target_column argument; by default it is the last one.
So you either need to switch the label dtype to float (if you're predicting real values), or add a label column to your data.
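One more thing worth checking, going by the traceback: the failing line is n_samples = int(header[0]), so the loader is trying to read a sample count from the first field of the first row, and it found a feature value there instead. Below is a minimal sketch of writing a file in the layout that line implies, using only the standard csv module under Python 3. The helper names are made up for illustration, the labels are hypothetical, and treating the second header field as the feature count is my assumption about this version of base.py, not something the traceback confirms.

```python
import csv


def write_csv_with_count_header(path, feature_rows, labels):
    """Write a CSV whose first row is (n_samples, n_features).

    Every later row holds the features followed by the label in the
    last column, which is where the answer says the label is expected
    by default. Hypothetical helper, not part of TensorFlow.
    """
    if len(feature_rows) != len(labels):
        raise ValueError("need exactly one label per feature row")
    n_samples = len(feature_rows)
    n_features = len(feature_rows[0]) if feature_rows else 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Count header: the traceback shows int(header[0]) being parsed.
        writer.writerow([n_samples, n_features])
        for features, label in zip(feature_rows, labels):
            writer.writerow(list(features) + [label])


def read_counts(path):
    """Parse the first row the way the traceback line does.

    Reading header[1] as the feature count is an assumption.
    """
    with open(path, newline="") as f:
        header = next(csv.reader(f))
    return int(header[0]), int(header[1])
```

With a file written this way and integer labels in the last column, the call from the question can keep target_dtype = np.int. If you are predicting real values instead, pass a float dtype such as np.float32 for target_dtype.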
Upvotes: 1