Madhi

Reputation: 1236

TensorFlow: loading a CSV data file and training the model

I am new to TensorFlow. I need to load a dataset to train my model. A sample of my dataset looks like this:

TRAINING_FILE.iloc[0:5,0:5]

   num_var_1  num_var_2  num_var_3     num_var_4  num_var_5
0  -0.010655   0.040182        0.0  1.800000e-07  -0.011319
1  -0.006542   0.157872        0.0  2.105000e-06  -0.010966
2  -0.010626   0.089140        0.0  3.550000e-07  -0.011286
3  -0.010626   0.227239        0.0  1.050000e-06  -0.011159
4  -0.008947   0.160410        0.0  2.105000e-06  -0.010966

I load this CSV file with the code given in the TensorFlow documentation. This is how I loaded my training file:

train_fn = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename = TRAINING_FILE,
    target_dtype = np.int,
    features_dtype = np.float32)

When I run the script, I get the following error:

Traceback (most recent call last):
  File "train.py", line 31, in <module>
    features_dtype = np.float32)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 48, in load_csv_with_header
    n_samples = int(header[0])
ValueError: invalid literal for int() with base 10: '-0.0106550312'

Upvotes: 2

Views: 745

Answers (1)

Allen Lavoie

Reputation: 5808

Those all look like floats, but load_csv_with_header is looking for a label column with dtype target_dtype (integer in your case). You can select this column with the target_column argument, but it's the last one by default.
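To illustrate what the loader seems to be doing, here is a rough standard-library sketch. It is not the TensorFlow source: `load_rows` is a hypothetical name, and the layout it assumes (first row holds the sample count and feature count, label in the last column by default) is inferred from the `n_samples = int(header[0])` line in the traceback and from how `target_column` behaves.

```python
import csv
import io


def load_rows(csv_text, target_column=-1, target_type=int):
    """Hypothetical re-creation of the loader's logic, not the real API.

    Assumes the first row holds n_samples and n_features, and that each
    following row holds the features plus one label column.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    n_samples = int(header[0])   # raises ValueError if the first row is data
    n_features = int(header[1])
    features, targets = [], []
    for row in reader:
        # Pull the label out of the row; the rest are features.
        targets.append(target_type(row.pop(target_column)))
        features.append([float(value) for value in row])
    if len(features) != n_samples:
        raise ValueError("row count does not match the header")
    if any(len(f) != n_features for f in features):
        raise ValueError("feature count does not match the header")
    return features, targets


# A file in the assumed layout: counts first, then features with the label last.
features, targets = load_rows("2,2\n0.1,0.2,0\n0.3,0.4,1\n")
```

If the first row of the file is already a data row, as the `'-0.0106550312'` in your error suggests, the `int(header[0])` call fails before any label is read.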

So you either need to switch the label dtype to float (if you're predicting real values), or add a label column to your data.
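If you go the second route, something like the following could write the data out with a label column. This is only a sketch under assumptions: `to_counted_csv` is a made-up helper, the labels are placeholders (your sample does not show any), and the counts row at the top is my reading of what the traceback implies the loader wants.

```python
import csv
import io


def to_counted_csv(features, labels):
    """Return CSV text: a counts row, then one row per sample, label last.

    Hypothetical helper; the layout is an assumption about what the
    loader expects, not something taken from the TensorFlow docs.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    n_features = len(features[0]) if features else 0
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([len(features), n_features])  # n_samples, n_features
    for row, label in zip(features, labels):
        writer.writerow(list(row) + [label])      # label goes in the last column
    return buffer.getvalue()


# Placeholder integer labels, purely for illustration.
csv_text = to_counted_csv([[0.1, 0.2], [0.3, 0.4]], [0, 1])
```

For the first route (predicting real values), the labels would simply be floats and target_dtype would be a float type such as np.float32.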

Upvotes: 1
