lomingchun

Reputation: 37

Calculate percent value across a row in a dataframe

I have a dataframe with one column per energy source and one row per year. How do I calculate the percentage that each column contributes to the total for each year?

year    Biomass Energy Production  Coal Production  Crude Oil Production
1949                     1.549262        11.973882             10.683252
1950                     1.562307        14.060135             11.446729
1951                     1.534669        14.419325             13.036724
1952                     1.474369        12.734313             13.281049
1953                     1.418601        12.277746             13.671076
1954                     1.394327        10.542448             13.426930
1955                     1.424143        12.369608             14.409682
1956                     1.415871        13.306334             15.180241

Upvotes: 1

Views: 4780

Answers (1)

userABC123

Reputation: 1522

Quickly, here's how I did it:

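import pandas as pd

df = pd.read_csv('energy.csv')
# every column except 'year' is an energy source
col_list = list(df)
col_list.remove('year')
# row-wise total across the energy columns
df['total'] = df[col_list].sum(axis=1)
df1 = df.drop(['year'], axis=1)
# divide each row by its own total; 'total' is still a column of df1, so it comes out as 100
percent = df1.div(df1.total, axis='index') * 100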

>>> percent
   Biomass.Energy.Production  Coal.Production  Crude.Oil.Production  total
0                   6.400218        49.465778             44.134005    100
1                   5.771536        51.941506             42.286958    100
2                   5.293656        49.737730             44.968614    100
3                   5.363345        46.323891             48.312765    100
4                   5.183539        44.862631             49.953830    100
5                   5.497332        41.565095             52.937574    100
6                   5.049538        43.858519             51.091943    100
7                   4.734967        44.499149             50.765884    100

-------------

Edit:

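df = pd.read_csv('energy.csv')
# axis is passed by keyword: pandas 2.0 and later reject a positional axis in drop() and concat()
x = df.drop(columns='year')
percent = pd.concat([df.year, x.div(x.sum(axis=1), axis='index') * 100], axis=1)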

Edit2:

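df = pd.read_csv('energy.csv')
# with 'year' as the index, only the energy columns take part in the arithmetic
df = df.set_index(['year'])
percent = df.div(df.sum(axis=1) / 100, axis=0)
# put 'year' back as an ordinary column in the original frame
df = df.reset_index()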
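
If you don't have energy.csv to hand, here is a minimal self-contained sketch of the Edit2 approach. It assumes the first three rows from the question typed in by hand; the names `values` and `row_totals` are only illustrative and not part of the original code.

```python
import pandas as pd

# First three rows copied from the question (stand-in for energy.csv)
df = pd.DataFrame({
    "year": [1949, 1950, 1951],
    "Biomass Energy Production": [1.549262, 1.562307, 1.534669],
    "Coal Production": [11.973882, 14.060135, 14.419325],
    "Crude Oil Production": [10.683252, 11.446729, 13.036724],
})

# Move 'year' into the index so it is excluded from the sums
values = df.set_index("year")

# Total energy per year (one value per row)
row_totals = values.sum(axis=1)

# Divide each row by its own total, then scale to percent
percent = values.div(row_totals, axis=0) * 100

print(percent.round(2))
```

Each row of `percent` should add up to 100 (up to floating-point error), which is a quick way to check that the division ran along the rows rather than the columns.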

Upvotes: 5
