Reputation: 469
I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
'VEI': [0.0, 3.0,3.0,2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
'Lat': [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]})
How can I find the percentage of each volcano by VEI? There was a similar question here, but I couldn't figure out how to apply it to my data.
I guess I should start with something like
df.groupby('VEI').count()
or
df.pivot_table( index=['Volcano Name','VEI'], columns='Volcano Name')
Thank you.
Upvotes: 1
Views: 97
Reputation: 658
This snippet groups the rows by volcano name, sums the VEI for each volcano, and expresses each sum as a percentage of the total VEI over all rows. This might not be exactly what you want (see the comments on your question), but the approach should be easy to adjust to your needs.
sum_vei = df["VEI"].sum()
result = 100 * df.groupby('Volcano Name')["VEI"].sum() / sum_vei
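If the goal is instead the share of each volcano within each VEI level (counting rows rather than summing VEI values), a minimal sketch using pd.crosstab might look like the following. It assumes each row is one eruption and that this is the intended reading, which the question does not confirm; the name pct_by_vei is chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
    'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
    'VEI': [0.0, 3.0, 3.0, 2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
    'Lat': [31] * 11,
})

# Assumption: one row = one eruption.
# Count rows per (VEI, volcano), normalise each VEI row so it sums to 1,
# then scale to percent.
pct_by_vei = pd.crosstab(df['VEI'], df['Volcano Name'], normalize='index') * 100

print(pct_by_vei.round(1))
```

Each row of pct_by_vei then sums to 100. Passing normalize='columns' instead would give the distribution of VEI levels within each volcano.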
Upvotes: 1