Reputation: 14689
I get an error when trying to access a single element in a pandas dataframe this way: test_df["LABEL"][0]. Here is a code snippet showing how I am loading the data:
print "reading test set"
test_set = pd.read_csv(data_path+"small_test_products.txt", header=0, delimiter="|")
print "shape of the test set", test_set.shape
test_df = pd.DataFrame(test_set)
lengthOfTestSet = len(test_df["LABEL"])
print test_df["LABEL"][0]
Here is the error I am getting:
File "code.py", line 80, in <module>
print test_df["LABEL"][0]
File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 521, in __getitem__
result = self.index.get_value(self, key)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 3562, in get_value
loc = self.get_loc(k)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 3619, in get_loc
return super(Float64Index, self).get_loc(key, method=method)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 1572, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
File "pandas/hashtable.pyx", line 541, in pandas.hashtable.Float64HashTable.get_item (pandas/hashtable.c:9914)
File "pandas/hashtable.pyx", line 547, in pandas.hashtable.Float64HashTable.get_item (pandas/hashtable.c:9852)
KeyError: 0.0
What am I missing?
Upvotes: 10
Views: 117528
Reputation: 23071
In the OP's case, the variable name test_df suggests that it was created by splitting a dataframe into train and test sets, so it's very likely that test_df has no row whose index label is 0. You can check this with
0 in test_df.index
and if it returns False, then there is no index label 0.
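A minimal sketch of that situation, assuming a made-up toy frame (only the column name LABEL comes from the question). Slicing off the first rows, as a train/test split or a filter might, keeps the original index labels, so label 0 is gone:

```python
import pandas as pd

# Hypothetical data for illustration; not the OP's file.
df = pd.DataFrame({"LABEL": ["a", "b", "c", "d"]})

# Keep only the last two rows; their index labels stay 2 and 3.
test_df = df.iloc[2:]

# Membership test on the index: is there a row labelled 0?
has_zero = 0 in test_df.index

# Label-based lookup on the Series fails because label 0 does not exist.
try:
    test_df["LABEL"][0]
    raised = False
except KeyError:
    raised = True
```

Here has_zero is False and the lookup raises KeyError, which is the same kind of failure as in the question.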
Nevertheless, to access the first row, you can use test_df.iloc or test_df.take() (similar to numpy.take) or even loc:
test_df.take([0])
test_df.iloc[0]
test_df.loc[test_df.index[0]]
For a scalar value, you can even use iat:
test_df["LABEL"].iat[0]
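To see how these accessors behave when the index does not start at 0, here is a small sketch. The frame, its index labels, and the PRICE column are assumptions made up for the example:

```python
import pandas as pd

# Hypothetical frame whose index labels are 2 and 3 (no label 0).
test_df = pd.DataFrame(
    {"LABEL": ["c", "d"], "PRICE": [3.0, 4.0]},
    index=[2, 3],
)

first_row = test_df.iloc[0]                     # first row by position, as a Series
first_row_frame = test_df.take([0])             # first row by position, as a one-row DataFrame
first_by_label = test_df.loc[test_df.index[0]]  # same row, looked up via its actual label
first_label = test_df["LABEL"].iat[0]           # a single scalar, by position
```

All four go through position (or through the label found at position 0), so none of them depends on a row labelled 0 existing.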
If the index is not important and you want to reset it to a range index, then as Seth suggests, reset the index; just make sure to assign the result back (so that the change is permanent).
test_df = test_df.reset_index() # the old index becomes a column in the dataframe
test_df = test_df.reset_index(drop=True) # the old index is thrown away
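The difference between the two calls can be sketched on a made-up frame (the data and index labels here are assumptions, not the OP's):

```python
import pandas as pd

# Hypothetical frame with index labels 2 and 3.
test_df = pd.DataFrame({"LABEL": ["c", "d"]}, index=[2, 3])

# Without drop=True the old index is kept as a new column (named "index"
# here because the original index has no name).
kept = test_df.reset_index()

# With drop=True the old index is discarded.
dropped = test_df.reset_index(drop=True)

# After the reset, label 0 exists, so the lookup from the question works.
value = dropped["LABEL"][0]
```

Note that test_df itself is unchanged in this sketch, because reset_index returns a new dataframe; that is why the result has to be assigned back.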
You may also get a key error for columns if the dataframe doesn't have a column with a specific name. A common culprit is leading/trailing whitespace, e.g. 'LABEL ' instead of 'LABEL'. The following must return True for you to be able to select the LABEL column.
'LABEL' in test_df.columns
If the above returns False, try
test_df.columns = test_df.columns.str.strip()
and then try selecting via test_df['LABEL'] again.
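A short sketch of the whitespace case, assuming a made-up frame whose column name has a trailing space:

```python
import pandas as pd

# Hypothetical frame; note the trailing space in the column name.
test_df = pd.DataFrame({"LABEL ": ["a", "b"]})

before = "LABEL" in test_df.columns   # the exact name is not present yet

# Strip leading/trailing whitespace from every column name.
test_df.columns = test_df.columns.str.strip()

after = "LABEL" in test_df.columns    # now the exact name is present
first = test_df["LABEL"][0]           # column selection works
```

Before stripping, test_df['LABEL'] would raise a KeyError even though the column looks right when printed.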
Upvotes: 1
Reputation: 466
As EdChum said, 0 is probably not in your index.
Try df.iloc[0] or df['label'].iloc[0], which use integer-based (positional) location.
To reset the index if you are having trouble with that: df = df.reset_index(drop=True) (assign the result back, since reset_index returns a new dataframe by default).
Check out the pandas indexing documentation for more information.
Upvotes: 18