Sam B.

Reputation: 3033

TensorFlow/Keras: KeyError when training a model

OK, from the top, here are the imports I use:

import keras
from keras import layers
from keras.models import Sequential
import pandas as pd
from sklearn.model_selection import train_test_split

I then read the data from a CSV with pandas, select the feature columns as X and the target column as y, and split both into a training set and a test set:

df = pd.read_csv('data/BCHAIN-NEW.csv')
y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)

Shuffling is disabled, so the split is sequential: the first 80% of the rows become the training set and the last 20% become the test set.

X_test.head()
>>>
        Value USD   Drop 7  Up 7    Mean Change 7   Change
2320    1023.14     5.0     2.0     -22.754286      -103.62
2321    1126.76     5.0     2.0     -4.470000       132.09
2322    994.67      5.0     2.0     9.865714        111.58
2323    883.09      5.0     2.0     9.005714        -13.74
2324    896.83      5.0     2.0     12.797143       -11.31

X_train.head()
>>>
    Value USD   Drop 7  Up 7    Mean Change 7   Change
0   0.06480     2.0     4.0     -0.000429       -0.00420
1   0.06900     1.0     5.0     0.000274        0.00403
2   0.06497     1.0     5.0     0.000229        0.00007
3   0.06490     1.0     5.0     0.000514        0.00200
4   0.06290     2.0     4.0     0.000229        -0.00050

Running the model like so now throws a KeyError:

model = Sequential()
model.add(layers.Dense(100, activation='relu', input_shape=(5,)))
model.add(layers.Dense(100, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=3)

>>>
Epoch 1/3

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-38-868bc86350df> in <module>()
      4 model.add(layers.Dense(5, activation='softmax'))
      5 model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
----> 6 model.fit(X_train, y_train, epochs=3)

c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)

...

c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\pandas\core\indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1267                 if mask.any():
   1268                     raise KeyError('{mask} not in index'
-> 1269                                    .format(mask=objarr[mask]))
   1270 
   1271                 return _values_from_object(indexer)

KeyError: '[1330  480  101 2009 1131  379 1498 2188 2121  700 1877 2011 2244 1262\n 1493  956  150  479 1345 1073 1173 1909 2260 2288  355  670 2143 1426\n   42  952  358 1183] not in index'

Upvotes: 0

Views: 2943

Answers (1)

VegardKT

Reputation: 1246

It seems to me that your data is in the wrong format: the inputs need to be NumPy arrays (assuming they are not already). In your code, X_train is a pandas DataFrame and y_train is a pandas Series.

Try converting them like so:

import numpy as np

X_train = np.array(X_train)
y_train = np.array(y_train)
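
Here is a minimal sketch of why the conversion plausibly helps. It assumes that this Keras version slices each batch with an array of shuffled integer positions; the elided middle of the traceback does not show this, so treat it as a guess. The names `frame` and `batch_positions` are hypothetical stand-ins, and the toy values are borrowed from the X_train.head() output in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for X_train (two of the question's columns).
frame = pd.DataFrame({
    "Value USD": [0.06480, 0.06900, 0.06497],
    "Change": [-0.00420, 0.00403, 0.00007],
})

# Hypothetical shuffled batch of row positions, as a training loop might produce.
batch_positions = np.array([2, 0])

try:
    # On a DataFrame, square brackets with an array look up COLUMN labels,
    # and no column is named 2 or 0, so pandas raises KeyError.
    frame[batch_positions]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# After conversion, the same index array selects ROWS by position.
rows = np.array(frame)[batch_positions]

print(raised_key_error)
print(rows)
```

If that assumption holds, the numbers listed in the KeyError message are shuffled row positions that pandas tried to interpret as column labels, which is why handing Keras plain arrays sidesteps the lookup.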

Upvotes: 2
