Error on value column in group by value counts

Question

I have code below as:

df[('name')]['cash_amount'].valuecounts(normalize=True).sum()

I want to use valuecounts normalize true, because I want to calculate the % of each names cash over the total amount of cash in the column.

Where I am trying to calculate the total number each name has in the cash_amount column, but I get error that says - KeyError: 'cash_amount'

df looks like

input:

name | cash_amount
bob      $400
chris    $500
amy      $100
amy      $100
bob      $100
bob      $100

output:

name |  % 
bob    .46
chris  .38
amy    .15

I looked for any white spaces in the column name and tried df.columns = df.columns.str.strip() and still get same error

David Erickson · Accepted Answer

First remove the $ from your string and convert to float or int. $ is a regex character, so you need to escape it with \. Then .groupby and get the percentage of total by getting the sum for each group and dividing it by the toal sum:

import pandas as pd
df = pd.DataFrame({'name': {0: 'bob', 1: 'chris', 2: 'amy', 3: 'amy', 4: 'bob', 5: 'bob'},
 'cash_amount': {0: '$400', 1: '$500', 2: '$100', 3: '$100', 4: '$100', 5: '$100'}})
df['cash_amount'] = df['cash_amount'].str.replace('\$', '').astype(float)
df = ((df.groupby('name')['cash_amount'].sum() / df['cash_amount'].sum())
      .rename('%').reset_index())
df
Out[1]: 
    name         %
0    amy  0.153846
1    bob  0.461538
2  chris  0.384615

Error on value column in group by value counts

Answers (2)

Related Questions