Reputation: 73
I need to do some maths using the following dataframe. In a for loop iterating through VALUE column cells, I need to grab the corresponding FracDist.
    VALUE  FracDist
0      11  0.022133
1      21  0.021187
2      22  0.001336
3      23  0.000303
4      24  0.000015
5      31  0.000611
6      41  0.040523
7      42  0.285630
8      43  0.161956
9      52  0.296993
10     71  0.160705
11     82  0.008424
12     90  0.000130
13     95  0.000053
First I made a list of VALUE values which I can use in a for loop, which worked as expected:
IN: LCvals = df['VALUE'].tolist()
print LCvals
OUT: [11, 21, 22, 23, 24, 31, 41, 42, 43, 52, 71, 82, 90, 95]
The problem comes up when I try to grab a cell from the dataframe's FracDist column based on which VALUE row the for loop is on. Instead of looking up rows by the contents of the VALUE column, the code looks up rows using VALUE as the index label. This is what I get:
IN: for val in LCvals:
        print val
        print LCdf.loc[val]['FracDist']
OUT: 11
0.00842444155517
21
KeyError: 'the label [21] is not in the [index]'
Note that the FracDist row that is grabbed for VALUE=11 is from index 11, not VALUE 11.
What needs to change in that for loop code to query rows based on VALUE in the VALUE column rather than VALUE as a spot in the index?
Upvotes: 0
Views: 47
Reputation: 164623
Here, pd.DataFrame.loc indexes first by row label and then, if a second argument is supplied, by column label. This is by design. See also Indexing and Selecting Data.
Don't, under any circumstances, use chained indexing. For example, Boolean indexing followed by column label selection via LCdf.loc[LCdf['VALUE']==val]['FracDist'] is not recommended.
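As a hedged sketch of the non-chained form: a single .loc call can take the Boolean mask as the row indexer and the column label as the column indexer. The frame below is a small subset of the question's data, built here only for illustration, and the names matches and frac are hypothetical.

```python
import pandas as pd

# Small subset of the question's data (built here only for illustration)
LCdf = pd.DataFrame({
    'VALUE': [11, 21, 22],
    'FracDist': [0.022133, 0.021187, 0.001336],
})

val = 21

# One .loc call: the Boolean mask selects rows, 'FracDist' selects the column
matches = LCdf.loc[LCdf['VALUE'] == val, 'FracDist']

# The result is a Series (it could hold several rows); take the first match
frac = matches.iloc[0]
print(frac)
```

This still scans the whole VALUE column on every lookup, so inside a loop over many values the set_index approach is likely the better fit.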
If you wish to iterate a single series, you can use pd.Series.items. But here you are using 'VALUE' as if it were an index, so you can use set_index first:
for val, dist in df.set_index('VALUE')['FracDist'].items():
    print(val, dist)
11 0.022133
21 0.021187
...
90 0.00013
95 5.3e-05
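If the loop has to stay driven by the LCvals list from the question, the same set_index idea can be used for label lookups. This is a minimal sketch that assumes the VALUE entries are unique; the name frac_by_value is hypothetical and the frame is a small subset of the question's data.

```python
import pandas as pd

# Small subset of the question's data (built here only for illustration)
LCdf = pd.DataFrame({
    'VALUE': [11, 21, 22],
    'FracDist': [0.022133, 0.021187, 0.001336],
})
LCvals = LCdf['VALUE'].tolist()

# Series of FracDist keyed by VALUE instead of the default 0..n-1 index
frac_by_value = LCdf.set_index('VALUE')['FracDist']

results = []
for val in LCvals:
    # .loc now matches on VALUE labels, so 21 finds the VALUE == 21 row
    dist = frac_by_value.loc[val]
    results.append((val, dist))
    print(val, dist)
```

With duplicate VALUE entries, .loc would return a Series rather than a single number, so the uniqueness assumption matters.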
Upvotes: 2
Reputation: 1704
If you pass an integer to .loc, it returns (in this case) the row whose index label is that integer. You could instead filter on the column: LCdf.loc[LCdf['VALUE']==val]['FracDist'].
Edit: Here is a better (more efficient) answer:
for index, row in LCdf.iterrows():
    print(row['VALUE'])
    print(row['FracDist'])
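One caveat with iterrows is that each row comes back as a Series, so dtypes are not preserved across the row. The pandas documentation suggests itertuples when dtypes should be preserved, and notes it is generally faster. A minimal sketch of that alternative, using a small subset of the question's data (the list name pairs is hypothetical):

```python
import pandas as pd

# Small subset of the question's data (built here only for illustration)
LCdf = pd.DataFrame({
    'VALUE': [11, 21, 22],
    'FracDist': [0.022133, 0.021187, 0.001336],
})

pairs = []
# index=False drops the index from each tuple; fields are named after columns
for row in LCdf.itertuples(index=False):
    pairs.append((row.VALUE, row.FracDist))
    print(row.VALUE, row.FracDist)
```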
Upvotes: 1