cr0

Reputation: 73

Trying to call a cell value, why is my list of column values being interpreted as index values?

I need to do some maths using the following dataframe. In a for loop iterating through VALUE column cells, I need to grab the corresponding FracDist.

    VALUE  FracDist
0      11  0.022133
1      21  0.021187
2      22  0.001336
3      23  0.000303
4      24  0.000015
5      31  0.000611
6      41  0.040523
7      42  0.285630
8      43  0.161956
9      52  0.296993
10     71  0.160705
11     82  0.008424
12     90  0.000130
13     95  0.000053

First I made a list of VALUE values which I can use in a for loop, which worked as expected:

IN: LCvals = df['VALUE'].tolist()
    print LCvals
OUT: [11, 21, 22, 23, 24, 31, 41, 42, 43, 52, 71, 82, 90, 95]

The problem comes up when I try to grab a cell from the dataframe's FracDist column based on which VALUE row the for loop is on. Instead of looking up rows by matching VALUE against the VALUE column, the code tries to look up rows using VALUE as an index label. So what I get is:

IN:    for val in LCvals:
            print val
            print LCdf.loc[val]['FracDist']

OUT:    11
        0.00842444155517
        21
        KeyError: 'the label [21] is not in the [index]'

Note that the FracDist row that is grabbed for VALUE=11 is from index 11, not VALUE 11.

What needs to change in that for loop code to query rows based on VALUE in the VALUE column rather than VALUE as a spot in the index?

Upvotes: 0

Views: 47

Answers (2)

jpp

Reputation: 164623

Here pd.DataFrame.loc will index first by row label and then, if a second argument is supplied, by column label. This is by design. See also Indexing and Selecting Data.

Don't, under any circumstances, use chained indexing. For example, Boolean indexing followed by column label selection via LCdf.loc[LCdf['VALUE']==val]['FracDist'] is not recommended.
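As a rough sketch of the non-chained form: the row mask and the column label can go into a single .loc call. The three-row dataframe below is only a stand-in rebuilt from the first rows of the question's data, and `matches` and `frac_dist` are illustrative names.

```python
import pandas as pd

# Stand-in for the question's dataframe (first three rows only).
LCdf = pd.DataFrame({
    "VALUE": [11, 21, 22],
    "FracDist": [0.022133, 0.021187, 0.001336],
})

val = 21

# Boolean row mask and column label in one .loc call, so nothing is chained.
matches = LCdf.loc[LCdf["VALUE"] == val, "FracDist"]

# The result is a Series with one entry per matching row;
# take the first entry to get a scalar.
frac_dist = matches.iloc[0]
print(frac_dist)
```

Note that this still scans the whole VALUE column on every lookup, so inside a loop it does more work than indexing by VALUE once.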

If you wish to iterate a single series, you can use pd.Series.items. But here you are using 'VALUE' as if it were an index, so you can use set_index first:

for val, dist in df.set_index('VALUE')['FracDist'].items():
    print(val, dist)

11 0.022133
21 0.021187
...
90 0.00013
95 5.3e-05
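If you would rather keep the original loop over LCvals, a minimal sketch is to build the VALUE-indexed series once and look each value up with .loc. This assumes the VALUE entries are unique; the four-row dataframe is a stand-in taken from the question's data, and `frac_by_value` and `looked_up` are illustrative names.

```python
import pandas as pd

# Stand-in for the question's dataframe (a few rows only).
df = pd.DataFrame({
    "VALUE": [11, 21, 22, 82],
    "FracDist": [0.022133, 0.021187, 0.001336, 0.008424],
})

# Series whose index labels are the VALUE entries,
# so .loc now looks rows up by VALUE.
frac_by_value = df.set_index("VALUE")["FracDist"]

LCvals = df["VALUE"].tolist()

# Assumes VALUE is unique; a duplicated VALUE would return a Series here.
looked_up = [frac_by_value.loc[val] for val in LCvals]
print(looked_up)
```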

Upvotes: 2

Joe Patten

Reputation: 1704

If you pass an integer to .loc, it returns (in this case) the row whose index label is that integer. You could instead filter on the column: LCdf.loc[LCdf['VALUE']==val]['FracDist'].

Edit: Here is a better (more efficient) answer:

for index, row in LCdf.iterrows():
    print(row['VALUE'])
    print(row['FracDist'])
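One caveat with iterrows is that each row comes back as a Series with a single dtype, so with an integer column next to a float column the VALUE entries are printed as floats (11.0 rather than 11). A possible alternative, sketched here on a stand-in dataframe, is itertuples, which keeps each column's own type; `pairs` is an illustrative name.

```python
import pandas as pd

# Stand-in for the question's dataframe (first three rows only).
LCdf = pd.DataFrame({
    "VALUE": [11, 21, 22],
    "FracDist": [0.022133, 0.021187, 0.001336],
})

pairs = []
# index=False drops the index from each tuple;
# the fields are named after the columns.
for row in LCdf.itertuples(index=False):
    print(row.VALUE)
    print(row.FracDist)
    pairs.append((row.VALUE, row.FracDist))
```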

Upvotes: 1
