user5779223

Reputation: 1490

Cannot get right slice bound for non-unique label when indexing data frame with python-pandas

I have a data frame df like this:

a         b
10        2
3         1
0         0
0         4
....
# about 50,000+ rows

I want to select df[:5, 'a']. But when I call df.loc[:5, 'a'], I get an error: KeyError: 'Cannot get right slice bound for non-unique label: 5'. When I call df.loc[5], the result contains 250 rows, while df.iloc[5] returns just one. Why does this happen, and how can I index it properly? Thank you in advance!

Upvotes: 8

Views: 22768

Answers (3)

timmy

Reputation: 101

To filter on a non-unique index, try a boolean mask, something like this: df.loc[(df.index>0)&(df.index<2)]
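A minimal sketch of the boolean-mask approach, in case it helps. The small frame below is made up for illustration (it is not the asker's data); it just has a non-unique, unsorted index like the one in the question.

```python
import pandas as pd

# Hypothetical data: the label 5 appears twice and the index is not sorted
df = pd.DataFrame(
    {"a": [10, 3, 0, 0, 7], "b": [2, 1, 0, 4, 9]},
    index=[5, 1, 5, 0, 7],
)

# A boolean mask compares every index label, so duplicates are no problem
up_to_five = df.loc[df.index <= 5, "a"]

# Two conditions combined with &, as in the one-liner above
between = df.loc[(df.index > 0) & (df.index < 5)]

print(up_to_five)
print(between)
```

The mask keeps the original row order and returns every row whose label matches, including both rows labelled 5.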

Upvotes: 10

Sujith Rao

Reputation: 17

The issue with the way you are indexing is that there are multiple rows with the index value 5, so .loc does not know which one to use as the end of the slice. To check, just run df.loc[5]; you will get all the rows that share that index value. You can either sort the index with sort_index, or first aggregate the data by index and then retrieve the rows. Hope this helps.
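Both options could look roughly like the sketch below. The data is made up, and using sum() as the aggregation is only an assumption; pick whatever aggregation makes sense for your columns.

```python
import pandas as pd

# Hypothetical data with a duplicated, unsorted index
df = pd.DataFrame(
    {"a": [10, 3, 0, 0, 7], "b": [2, 1, 0, 4, 9]},
    index=[5, 1, 5, 0, 7],
)

# Check how many rows share the label 5
n_rows_with_5 = len(df.loc[5])

# Option 1: sort the index; label slices work on a sorted index,
# and the slice includes every row labelled 5
sorted_df = df.sort_index()
up_to_five = sorted_df.loc[:5, "a"]

# Option 2: aggregate duplicates so each label is unique (sum is an assumption)
aggregated = df.groupby(level=0).sum()
up_to_five_agg = aggregated.loc[:5, "a"]

print(n_rows_with_5)
print(up_to_five)
print(up_to_five_agg)
```

Note that sorting changes the row order and aggregating changes the values, so neither gives "the first five rows" of the original frame.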

Upvotes: 0

Stefan

Reputation: 42905

The error message is explained here: if the index is not monotonic, then both slice bounds must be unique members of the index.
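To see this on a small example, here is a sketch with made-up data where the label 5 is duplicated and the index is unsorted. The properties is_unique and is_monotonic_increasing on the index tell you which case you are in.

```python
import pandas as pd

# Hypothetical data: label 5 appears twice, in non-adjacent rows
df = pd.DataFrame({"a": [10, 3, 0, 0, 7]}, index=[5, 1, 5, 0, 7])

print(df.index.is_unique)                # False
print(df.index.is_monotonic_increasing)  # False

# Slicing up to a duplicated label on a non-monotonic index fails
try:
    df.loc[:5, "a"]
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)
```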

The difference between .loc and .iloc is label-based vs integer-position-based indexing - see docs. .loc is intended to select individual labels or slices of labels. That's why .loc[5] selects all rows where the index has the value 5 (250 rows in your case), and why the error is about a non-unique index. .iloc, in contrast, selects row number 5 (0-indexed). That's why you only get a single row, and its index value may or may not be 5. Hope this helps!
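A short sketch of the difference, using made-up data. If what you actually want is the first rows of column 'a' by position, .iloc is probably the right tool; keep in mind that .iloc slices exclude the end position, while .loc slices include the end label.

```python
import pandas as pd

# Hypothetical data with a duplicated, unsorted index
df = pd.DataFrame(
    {"a": [10, 3, 0, 0, 7, 8], "b": [2, 1, 0, 4, 9, 6]},
    index=[5, 1, 5, 0, 7, 2],
)

# Label-based: every row whose index label equals 5 (a DataFrame here)
by_label = df.loc[5]

# Position-based: the single row at position 5 (a Series); its label is 2
by_position = df.iloc[5]

# First five rows of column 'a' by position (positions 0 to 4)
first_five = df["a"].iloc[:5]

print(by_label)
print(by_position)
print(first_five)
```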

Upvotes: 8
