sam_sam

Reputation: 469

How can I find the percentage of something in a column of a dataframe in Python?

I have the following dataframe

import pandas as pd

df = pd.DataFrame({'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
                   'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
                   'VEI': [0.0, 3.0,3.0,2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
                   'Lat': [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]})

How can I find the percentage of each volcano by VEI? There was a similar question here, but I couldn't figure out how to apply it to my case.

I guess I should start with something like

df.groupby('VEI').count()

or

df.pivot_table( index=['Volcano Name','VEI'], columns='Volcano Name')

Thank you

Upvotes: 1

Views: 97

Answers (1)

BenB

Reputation: 658

This snippet groups the rows by volcano name, sums the VEI for each volcano, and expresses that sum as a percentage of the total VEI over all rows. This might not be exactly what you want (see the comments on your question), but the approach should be easy to adapt to your needs.

sum_vei = df["VEI"].sum()  # total VEI over all rows
# Per-volcano VEI sum as a percentage of the total
result = 100 * df.groupby("Volcano Name")["VEI"].sum() / sum_vei
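If what you are after is instead a share of eruption counts rather than of summed VEI, here is a minimal sketch under that assumption. It only guesses at the intended meaning of "percentage of each volcano by VEI"; the names vei_pct and by_vei are made up for illustration, and the dataframe is cut down to the two relevant columns from the question.

```python
import pandas as pd

# Same data as in the question, reduced to the two columns used here
df = pd.DataFrame({
    'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
    'VEI': [0.0, 3.0, 3.0, 2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
})

# Assumption 1: percentage of all rows that fall into each VEI level
vei_pct = df['VEI'].value_counts(normalize=True).sort_index() * 100

# Assumption 2: within each VEI level, the percentage of rows belonging
# to each volcano (normalize='columns' makes every VEI column sum to 100)
by_vei = pd.crosstab(df['Volcano Name'], df['VEI'], normalize='columns') * 100

print(vei_pct)
print(by_vei.round(1))
```

Passing normalize='index' to pd.crosstab instead gives the VEI distribution within each volcano, so each row sums to 100.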

Upvotes: 1
