Reputation: 4564
I have a DataFrame as given below:
df =
0
1 0.993995
2 1.111068
3 1.760940
.
.
.
49 40.253574
50 40.664486
51 41.083962
I am iterating through each row and printing each element. My code is given below:
for idx, row in df.iterrows():
    print(df[0].iloc[idx])
Present output:
1.111068
1.76094
2.691832
.
.
40.664486
41.083962
Traceback (most recent call last):
File "<ipython-input-46-80539a9081e5>", line 2, in <module>
print(darkdf[0].iloc[idx])
File "C:\Users\MM\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\MM\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 2230, in _getitem_axis
self._validate_integer(key, axis)
File "C:\Users\MM\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 2139, in _validate_integer
raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
Why is this simple loop giving an error? Could someone help me understand what the error is saying?
Upvotes: 5
Views: 16316
Reputation: 862501
First, the correct way to select by label is to use DataFrame.loc:
print (df)
0
1 0.993995
2 1.111068
3 1.760940
for idx, row in df.iterrows():
    print(df.loc[idx, 0])
0.9939950000000001
1.111068
1.7609400000000002
The problem in your solution:
Series.iloc selects by position, not by label.
So when idx is 3, you are asking for the 4th row:
df[0].iloc[3]
but there is no 4th row (Python counts from 0, so position 3 means the 4th row), so an error is raised.
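To see the mismatch concretely, here is a minimal sketch that rebuilds a three-row frame like the one printed above. The values and the 1-based index come from the question; the variable names are only illustrative.

```python
import pandas as pd

# Three rows labelled 1, 2, 3 in a single column named 0 (as in the question).
df = pd.DataFrame({0: [0.993995, 1.111068, 1.760940]}, index=[1, 2, 3])

# Valid positions are 0, 1, 2; valid labels are 1, 2, 3.
last_by_position = df[0].iloc[2]   # third row, selected by position
last_by_label = df[0].loc[3]       # the same row, selected by label

try:
    df[0].iloc[3]                  # position 3 would be a fourth row
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(last_by_position, last_by_label)
print(error_message)
```

Both lookups return the same last value, while position 3 fails because only positions 0 to 2 exist.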
If you use:
df[0].loc[3]
it works as you expected, because it selects index label 3 (not position 4, which does not exist) in column 0, but it is better to use:
df.loc[idx, 0]
because it is a single indexing operation rather than chained indexing, where evaluation order matters.
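As a rough sketch of the loop with label-based selection (same assumed three-row frame as in the question's sample; the list names are hypothetical), note that the row yielded by iterrows already carries the value, so the extra lookup is optional:

```python
import pandas as pd

# Assumed sample data: column 0, index labels 1..3, as in the question.
df = pd.DataFrame({0: [0.993995, 1.111068, 1.760940]}, index=[1, 2, 3])

by_label = []
from_row = []
for idx, row in df.iterrows():
    by_label.append(df.loc[idx, 0])   # one indexing call: row label, column label
    from_row.append(row.loc[0])       # the row Series already holds the value

print(by_label)
print(from_row)
```

Both lists end up with the same three values, and neither approach depends on where the index labels start.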
Upvotes: 4
Reputation: 106
You may want to use loc instead of iloc. iloc uses the zero-based row position, not the index labels. Your code is passing the index labels, which run past the range of the zero-based row positions, hence the out-of-bounds error.
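If you do need iloc, one option is to translate each label into its position first. This is only a sketch on an assumed three-row frame shaped like the question's; Index.get_loc returns an integer position when the index labels are unique.

```python
import pandas as pd

# Assumed sample data: column 0, index labels 1..3, as in the question.
df = pd.DataFrame({0: [0.993995, 1.111068, 1.760940]}, index=[1, 2, 3])

labels = list(df.index)
# Map each index label to its zero-based row position.
positions = [df.index.get_loc(label) for label in labels]
# Positional lookups are now always within bounds.
values = [df[0].iloc[pos] for pos in positions]

print(labels)
print(positions)
print(values)
```

The labels are one ahead of the positions here, which is why passing the label straight to iloc runs off the end on the last row.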
Upvotes: 3