b-ryce

Reputation: 5828

Using pandas apply to calculate a value - KeyError

I have a pandas DataFrame that looks like this:

    country      year  cases   population
0   Afghanistan  '99   745     19987071
1   Brazil       '99   37737   172006362
2   China        '99   212258  1272915272
3   Afghanistan  '00   2666    20595360
4   Brazil       '00   80488   174504898
5   China        '00   213766  1280428583

I want to add a new column called 'prevalence' that is the row's cases divided by population. This line of code works:

G['prevalence'] = G['cases'] / G['population']

However, I want to do the same thing using pandas' apply. Here is what I'm trying to do:

def get_prev (x, y):
    return x / y

def calc_prevalence(G):
    assert 'cases' in G.columns and 'population' in G.columns
    ###
    ### YOUR CODE HERE
    to_return = G.copy()
    new_column = to_return.apply(lambda x: get_prev(to_return.population, to_return.cases), axis=1)
    to_return['prevalence'] = new_column
    return to_return
    ###

#G_copy = G.copy()
H = calc_prevalence(G)

I'm getting a KeyError: 'prevalence'

Any ideas what I'm doing wrong?

Upvotes: 1

Views: 255

Answers (1)

moys

Reputation: 8033

With axis=1, apply passes each row to the function as a Series, so index that row by column name inside the function:

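def func(x):
    # x is a single row (a Series) because apply is called with axis=1
    res = x['cases']/x['population']
    return res
# df is the question's DataFrame (called G there)
df['prevalence'] = df.apply(func, axis=1)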

Output

        country     year    cases   population  prevalence
0   Afghanistan     '99     745     19987071    0.000037
1   Brazil          '99     37737   172006362   0.000219
2   China           '99     212258  1272915272  0.000167
3   Afghanistan     '00     2666    20595360    0.000129
4   Brazil          '00     80488   174504898   0.000461
5   China           '00     213766  1280428583  0.000167
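
As for why the code in the question fails: the lambda ignores its row argument `x` and passes the whole `population` and `cases` columns to `get_prev` (also in swapped order), so each call returns a full Series and `apply` builds a DataFrame instead of a single column. Assigning that to `'prevalence'` is most likely what raises the error, though the exact exception may differ between pandas versions. Below is a minimal sketch of the question's `calc_prevalence` adapted to use the row; the sample frame is rebuilt from the table in the question, and everything else keeps the asker's names.

```python
import pandas as pd

# Sample data copied from the table in the question.
G = pd.DataFrame({
    "country": ["Afghanistan", "Brazil", "China", "Afghanistan", "Brazil", "China"],
    "year": ["'99", "'99", "'99", "'00", "'00", "'00"],
    "cases": [745, 37737, 212258, 2666, 80488, 213766],
    "population": [19987071, 172006362, 1272915272,
                   20595360, 174504898, 1280428583],
})

def get_prev(x, y):
    return x / y

def calc_prevalence(G):
    assert 'cases' in G.columns and 'population' in G.columns
    to_return = G.copy()  # leave the caller's frame untouched
    # With axis=1, `row` is one row as a Series; pull scalars out of it
    # by column name instead of passing whole columns.
    to_return['prevalence'] = to_return.apply(
        lambda row: get_prev(row['cases'], row['population']), axis=1
    )
    return to_return

H = calc_prevalence(G)
print(H)
```

For this particular calculation the vectorized `G['cases'] / G['population']` from the question gives the same values and is generally faster, so `apply` is mainly worth it when the per-row logic cannot be written as column arithmetic.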

Upvotes: 2
