Reputation: 5828
I have a pandas DataFrame that looks like this:
       country year   cases  population
0  Afghanistan  '99     745    19987071
1       Brazil  '99   37737   172006362
2        China  '99  212258  1272915272
3  Afghanistan  '00    2666    20595360
4       Brazil  '00   80488   174504898
5        China  '00  213766  1280428583
I want to add a new column called 'prevalence' that is the row's cases divided by population. This line of code works:
G['prevalence'] = G['cases'] / G['population']
However, I want to do the same thing using pandas' apply. Here is what I'm trying to do:
def get_prev (x, y):
    return x / y

def calc_prevalence(G):
    assert 'cases' in G.columns and 'population' in G.columns
    ###
    ### YOUR CODE HERE
    to_return = G.copy()
    new_column = to_return.apply(lambda x: get_prev(to_return.population, to_return.cases), axis=1)
    to_return['prevalence'] = new_column
    return to_return
    ###

#G_copy = G.copy()
H = calc_prevalence(G)
I'm getting a KeyError: 'prevalence'
Any ideas what I'm doing wrong?
Upvotes: 1
Views: 255
Reputation: 8033
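You can do this by passing apply a function that takes a single row, with axis=1 (df here stands for your DataFrame G):

def func(x):
    # x is one row of the DataFrame, so x['cases'] is a single value
    res = x['cases'] / x['population']
    return res

df['prevalence'] = df.apply(func, axis=1)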
Output:

       country year   cases  population  prevalence
0  Afghanistan  '99     745    19987071    0.000037
1       Brazil  '99   37737   172006362    0.000219
2        China  '99  212258  1272915272    0.000167
3  Afghanistan  '00    2666    20595360    0.000129
4       Brazil  '00   80488   174504898    0.000461
5        China  '00  213766  1280428583    0.000167
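
As for the function in the question, the likely problem is that the lambda ignores its argument x and passes the whole columns to_return.population and to_return.cases (in swapped order) to get_prev, so each row produces a full Series instead of one number. Below is a minimal sketch of a fix. It assumes the sample data from the question, and the lambda argument name row is just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
G = pd.DataFrame({
    "country": ["Afghanistan", "Brazil", "China", "Afghanistan", "Brazil", "China"],
    "year": ["'99", "'99", "'99", "'00", "'00", "'00"],
    "cases": [745, 37737, 212258, 2666, 80488, 213766],
    "population": [19987071, 172006362, 1272915272,
                   20595360, 174504898, 1280428583],
})


def get_prev(x, y):
    return x / y


def calc_prevalence(G):
    assert 'cases' in G.columns and 'population' in G.columns
    to_return = G.copy()
    # With axis=1 the lambda is called once per row. Read the two values
    # from that row, passing cases first so the ratio is cases / population.
    to_return['prevalence'] = to_return.apply(
        lambda row: get_prev(row['cases'], row['population']), axis=1
    )
    return to_return


H = calc_prevalence(G)
print(H)
```

Because calc_prevalence works on a copy, G itself is left without the new column. For a table this simple, the vectorized G['cases'] / G['population'] from the question gives the same values and avoids the per-row function call.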
Upvotes: 2