Reputation: 25
I have the following data:
country | code | continent | plants | invertebrates | vertebrates | total |
---|---|---|---|---|---|---|
Afghanistan | AFG | Asia | 5 | 2 | 33 | 40 |
Albania | ALB | Europe | 5 | 71 | 61 | 137 |
Algeria | DZA | Africa | 24 | 40 | 81 | 145 |
I want to add a hemisphere column whose value is determined by looking up each row's continent in a list. I want to do it with a custom function rather than a lambda.
I attempted the following:
northern = ['North America', 'Asia', 'Europe']
southern = ['Africa', 'South America', 'Oceania']
def hem(x, y):
    if y in northern:
        x = 'northern'
        return x
    elif y in southern:
        x = 'southern'
        return x
    else:
        x = 'Not Found'
        return x
species_custom['hemisphere'] = species_custom.apply(hem, args=(species_custom['continent'],), axis=1)
I receive the following error:
ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0')
Any help is greatly appreciated.
Upvotes: 0
Views: 41
Reputation: 3720
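hem is defined to take two arguments, x and y. With axis=1, apply passes each row as x, and the args tuple supplies y, which here is the entire continent column rather than that row's value. Testing y in northern then compares a whole Series against each list item, and that comparison is what raises the ValueError. Probably not what you want.
You could simplify by using nested numpy where calls.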
import numpy as np
df['hemisphere'] = np.where(df['continent'].isin(northern), 'northern', np.where(df['continent'].isin(southern),'southern','Not Found'))
Result
country code continent plants invertebrates vertebrates total hemisphere
0 Afghanistan AFG Asia 5 2 33 40 northern
1 Albania ALB Europe 5 71 61 137 northern
2 Algeria DZA Africa 24 40 81 145 southern
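Since the question asks for a custom function rather than a lambda, here is a minimal sketch of that route. It assumes hem is rewritten to take a single continent value and is applied to the continent column only. The three-row species_custom frame below is a stand-in built from the sample data, not the real dataset.

```python
import pandas as pd

northern = ['North America', 'Asia', 'Europe']
southern = ['Africa', 'South America', 'Oceania']

def hem(continent):
    """Return the hemisphere label for a single continent name."""
    if continent in northern:
        return 'northern'
    elif continent in southern:
        return 'southern'
    return 'Not Found'

# Stand-in for species_custom, using rows from the question's sample data
species_custom = pd.DataFrame({
    'country': ['Afghanistan', 'Albania', 'Algeria'],
    'continent': ['Asia', 'Europe', 'Africa'],
})

# Series.apply calls hem once per value, so hem only ever sees a scalar
species_custom['hemisphere'] = species_custom['continent'].apply(hem)
print(species_custom)
```

Because apply is called on the continent column instead of the whole frame, neither axis=1 nor args is needed, and the membership test never sees a Series.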
Upvotes: 0