Reputation: 4448
I have a dataframe as follows:
student-gender student-current-enrollment student-native-country car-owner
M BS Cuba N
F BS Brazil N
F BS US N
M MS US Y
F MS US N
M BS Cuba N
F BS Brazil N
F MS US N
F MS US N
M BS Cuba Y
F BS Brazil N
I need to find out what proportion of students from each country own a car. How do I do that?
Upvotes: 0
Views: 280
Reputation: 3639
You can use groupby and agg with a lambda function:
df.groupby('student-native-country').agg(
    # divide by the size of each group (len(x)), not by the whole dataframe
    {'car-owner': lambda x: (x == 'Y').sum() * 100 / len(x)}
)
Here is the result:
                        car-owner
student-native-country
Brazil                   0.000000
Cuba                    33.333333
US                      20.000000
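For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds only the two relevant columns from the question's sample data (the variable name result is just for illustration) and divides by the size of each group, which is what gives the per-country percentages shown above.

```python
import pandas as pd

# Sample data from the question, reduced to the two columns that matter here.
df = pd.DataFrame({
    "student-native-country": ["Cuba", "Brazil", "US", "US", "US", "Cuba",
                               "Brazil", "US", "US", "Cuba", "Brazil"],
    "car-owner": ["N", "N", "N", "Y", "N", "N", "N", "N", "N", "Y", "N"],
})

# Per country: number of car owners / number of students in that country, in percent.
result = df.groupby("student-native-country").agg(
    {"car-owner": lambda x: (x == "Y").sum() * 100 / len(x)}
)
print(result)
```

Note that dividing by len(df) instead of len(x) would use all 11 students as the denominator for every country, which is a different question.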
On the other hand, if you want each country's percentage of all the students who own a car:
df.groupby('student-native-country').agg(
{'car-owner': lambda x: (x == 'Y').sum() * 100 / (df['car-owner'] == 'Y').sum()}
)
And in this case, the result is:
                        car-owner
student-native-country
Brazil                        0.0
Cuba                         50.0
US                           50.0
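The two denominators (students in the country vs. all car owners) can also be compared side by side with pd.crosstab and its normalize argument. This is an alternative to the groupby/agg approach, not what the code above does; the sample frame and the names by_country and of_owners are assumptions made for this sketch.

```python
import pandas as pd

# Sample data from the question, reduced to the two columns that matter here.
df = pd.DataFrame({
    "student-native-country": ["Cuba", "Brazil", "US", "US", "US", "Cuba",
                               "Brazil", "US", "US", "Cuba", "Brazil"],
    "car-owner": ["N", "N", "N", "Y", "N", "N", "N", "N", "N", "Y", "N"],
})

countries = df["student-native-country"]
owners = df["car-owner"]

# normalize="index": each row (country) sums to 100, so the "Y" column is
# the percentage of that country's students who own a car.
by_country = pd.crosstab(countries, owners, normalize="index") * 100

# normalize="columns": each column sums to 100, so the "Y" column is
# each country's share of all car owners.
of_owners = pd.crosstab(countries, owners, normalize="columns") * 100

print(by_country["Y"])
print(of_owners["Y"])
```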
Upvotes: 1
Reputation: 195528
print(
df.replace({"N": 0, "Y": 1})
.groupby("student-native-country")["car-owner"]
.mean()
* 100
)
Prints:
student-native-country
Brazil     0.000000
Cuba      33.333333
US        20.000000
Name: car-owner, dtype: float64
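A variant of the same idea, as a sketch: rather than replacing "N"/"Y" across the whole dataframe, compare the one column to "Y" and take the mean of the resulting booleans per country. The sample frame and the name pct are assumptions made for this example.

```python
import pandas as pd

# Sample data from the question, reduced to the two columns that matter here.
df = pd.DataFrame({
    "student-native-country": ["Cuba", "Brazil", "US", "US", "US", "Cuba",
                               "Brazil", "US", "US", "Cuba", "Brazil"],
    "car-owner": ["N", "N", "N", "Y", "N", "N", "N", "N", "N", "Y", "N"],
})

# eq("Y") gives True/False per student; the mean of booleans within each
# country is the fraction of True values, i.e. the fraction of car owners.
pct = df["car-owner"].eq("Y").groupby(df["student-native-country"]).mean() * 100
print(pct)
```

This only touches the car-owner column, so any "N" or "Y" values that happen to appear in other columns are left alone.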
Upvotes: 1