Reputation: 2952
I want to slice my data in Python. The very basic task of slicing my dataframe throws unexpected errors.
My code is:
import pandas as pd
test_file = pd.read_csv("C:/Users/Lenovo/Desktop/testfile.csv")
test_select = test_file[["Category", "Shop"]]
print(test_select[1,1])
The code print(test_select[1,1]) should display the second row of the second column.
The error message:
  File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: (1, 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Lenovo/.PyCharmCE2018.1/config/scratches/Dictionary.py", line 8, in <module>
    print(h_select[1,1])
  File "C:\Users\Lenovo\PycharmProjects\mindnotez\venv\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\Lenovo\PycharmProjects\mindnotez\venv\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\Lenovo\PycharmProjects\mindnotez\venv\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\Lenovo\PycharmProjects\mindnotez\venv\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "C:\Users\Lenovo\PycharmProjects\mindnotez\venv\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: (1, 1)
When I run print(test_select.head()), I get the following output:
     Category           Shop
0       Jidlo         Albert
1       Jidlo          BILLA
2       Jidlo         Albert
3       Jidlo         Albert
4  Restaurant  Kockafé Freyd
Slicing the dataframe with print(test_select[1:4]) prints rows 1 through 3.
With the command print(test_select[1,1]), I want the second row of the second column. However, I receive the error message above.
Why do I receive the KeyError exception? What am I missing?
I use:
Upvotes: 1
Views: 3011
Reputation: 3902
When you want to slice a dataframe, you can select either by position or by label.
By row number (integer position)
df.iloc[[1, 5]] # to get rows 1 and 5
df.iloc[1:6] # to get rows 1 to 5 inclusive
You can also narrow it down to a specific column in the same call, which avoids chained indexing:
df.iloc[[1, 5], df.columns.get_loc('Shop')]
or multiple columns
df.iloc[[1, 5], df.columns.get_indexer(['Shop', 'Category'])]
By label-based index
# Numeric
df.loc[[1, 5]] # 1 and 5 are considered labels here
df.loc[[1, 5], 'Shop']
df.loc[[1, 5], ['Shop', 'Category']]
# Textual or otherwise
df.set_index('Shop', inplace=True)
df.loc[['BILLA', 'Albert'], 'Category']
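To make the calls above concrete, here is a minimal sketch run against a small dataframe. The data is an assumption: it is rebuilt by hand from the five rows of the head() output in the question, not read from the original CSV. The variable names (by_position, by_label, and so on) are only for illustration, and set_index is used without inplace=True so that df stays unchanged.

```python
import pandas as pd

# Assumed sample data: the five rows shown by head() in the question.
df = pd.DataFrame({
    "Category": ["Jidlo", "Jidlo", "Jidlo", "Jidlo", "Restaurant"],
    "Shop": ["Albert", "BILLA", "Albert", "Albert", "Kockafé Freyd"],
})

# Position-based: rows at positions 1 and 3, column position looked up by name.
by_position = df.iloc[[1, 3], df.columns.get_loc("Shop")]

# Label-based: the same rows, because the default index labels are 0..n-1.
by_label = df.loc[[1, 3], "Shop"]

# Several columns by position; get_indexer returns one position per name.
both = df.iloc[[1, 3], df.columns.get_indexer(["Shop", "Category"])]

# Textual labels: with a non-unique index, loc returns every matching row.
indexed = df.set_index("Shop")
by_shop = indexed.loc[["BILLA", "Albert"], "Category"]

print(by_position.tolist())
print(both)
print(by_shop)
```

Note that 'Albert' appears three times in this sample, so the last selection returns four rows rather than two.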
Upvotes: 4
Reputation: 1
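If you want the second row of the second column, you have to use df.iloc[1,1]. iloc extracts data by integer position.
[1,1] selects the row at position 1 and the column at position 1; since positions start at 0, that is the second row and the second column. The output would be 'BILLA'.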
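A short sketch of the difference, assuming a dataframe rebuilt by hand from the head() output in the question (the real CSV is not available here). Plain brackets treat the tuple (1, 1) as one column label, which is why the KeyError names (1, 1), while iloc reads it as [row position, column position].

```python
import pandas as pd

# Assumed sample data: the five rows shown by head() in the question.
test_select = pd.DataFrame({
    "Category": ["Jidlo", "Jidlo", "Jidlo", "Jidlo", "Restaurant"],
    "Shop": ["Albert", "BILLA", "Albert", "Albert", "Kockafé Freyd"],
})

# test_select[1, 1] looks for a column whose label is the tuple (1, 1).
try:
    test_select[1, 1]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# iloc takes [row position, column position], both counted from 0.
second_row_second_col = test_select.iloc[1, 1]
second_row_first_col = test_select.iloc[1, 0]

print(raised_key_error, second_row_second_col, second_row_first_col)
```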
Upvotes: 0
Reputation: 323226
Using loc
This selects by index label and column name rather than by position. Here it looks like your index runs from 0 to n, so for picking a single row, loc with the label gives the same result as iloc with the position:
df.loc[1,'Shop']
'BILLA'
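The equivalence only holds while the labels happen to match the positions, and slices behave differently even then. The sketch below illustrates both points on data rebuilt by hand from the head() output in the question (an assumption); reversed_df is a hypothetical name introduced only to show labels and positions drifting apart.

```python
import pandas as pd

# Assumed sample data: the five rows shown by head() in the question.
df = pd.DataFrame({
    "Category": ["Jidlo", "Jidlo", "Jidlo", "Jidlo", "Restaurant"],
    "Shop": ["Albert", "BILLA", "Albert", "Albert", "Kockafé Freyd"],
})

# Single cell: same answer while labels equal positions.
same_cell = df.loc[1, "Shop"] == df.iloc[1, 1]

# Slices differ: loc includes the end label, iloc excludes the end position.
label_slice = df.loc[1:3]      # labels 1, 2, 3
position_slice = df.iloc[1:3]  # positions 1, 2

# After reordering, labels travel with the rows but positions do not.
reversed_df = df.iloc[::-1]
by_label = reversed_df.loc[1, "Shop"]    # row labelled 1
by_position = reversed_df.iloc[1, 1]     # second row in the new order

print(same_cell, len(label_slice), len(position_slice), by_label, by_position)
```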
Upvotes: 2
Reputation: 164663
"The code print(test_select[1,1]) should display the second row of the second column."
No, it shouldn't. The syntax df[x] is usually reserved for retrieving a column (series), Boolean row indexing, or row slicing. These uses of pd.DataFrame.__getitem__, for which df[] is syntactic sugar, aren't conveniently documented. In general, they should be considered shortcuts, and if you are unsure you should prefer loc / iloc / at / iat, as appropriate.
To retrieve a scalar value via integer positional indexing, you can use pd.DataFrame.iat:
df.iat[1, 1]
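As a rough illustration of the three shortcut uses of df[] next to the scalar accessors, here is a sketch on data rebuilt by hand from the head() output in the question (an assumption, since the CSV itself is not shown). The variable names are illustrative only.

```python
import pandas as pd

# Assumed sample data: the five rows shown by head() in the question.
df = pd.DataFrame({
    "Category": ["Jidlo", "Jidlo", "Jidlo", "Jidlo", "Restaurant"],
    "Shop": ["Albert", "BILLA", "Albert", "Albert", "Kockafé Freyd"],
})

# 1. A column label returns that column as a Series.
shop_column = df["Shop"]

# 2. A Boolean Series filters rows.
albert_rows = df[df["Shop"] == "Albert"]

# 3. A slice selects rows by position (end excluded).
rows_1_to_3 = df[1:4]

# Scalar access: iat by integer position, at by label.
by_position = df.iat[1, 1]
by_label = df.at[1, "Shop"]

print(type(shop_column).__name__, len(albert_rows), rows_1_to_3.index.tolist())
print(by_position, by_label)
```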
Upvotes: 3