user12983086
user12983086

Reputation:

How to map the values in dataframe in pandas using python

have a df with values

df



name     marks     

mark      10
mark      40
tom       25
tom       20
mark      50
tom        5
tom       50
tom       25
tom       10
tom       15

How to sum the marks of names and count how many times it took

expected_output:

name     total     count


mark     100         3

tom      150         7

Upvotes: 1

Views: 51

Answers (3)

Roy2012
Roy2012

Reputation: 12523

Here's a solution. I'm doing it step-by-step for simplicity:

df["commulative_sum"] = df.groupby("name").cumsum()
df["commulative_sum_50"] = df["commulative_sum"] // 50
df["commulative_count"] = df.assign(one = 1).groupby("name").cumsum()["one"]

res = pd.pivot_table(df, index="name", columns="commulative_sum_50", values="commulative_count", aggfunc=min).drop(0, axis=1)

# the following two lines can be done in a loop if there are a lot of columns. I simplified it here. 
res[3] = res[3]-res[2]
res[2] = res[2]-res[1]
res.columns = ["50-" + str(c) for c in res.columns]

The result is:

      50-1  50-2  50-3
name                  
mark   2.0   1.0   NaN
tom    3.0   1.0   3.0

Upvotes: 1

Bertil Johannes Ipsen
Bertil Johannes Ipsen

Reputation: 1776

Edit: I see you changed the "to 50" part, this might not be revelant anymore.

I agree with Jezrael's answer, but in case you only want to count to 50, maybe something like this:

l = []
for name, group in df.groupby('name')['marks']:
    l.append({'name': name, 
              'total': group.sum(), 
              'count': group.loc[:group.eq(50).idxmax()].count()})
pd.DataFrame(l)

   name  total  count
0  mark    100      3
1   tom    150      4

Upvotes: 0

jezrael
jezrael

Reputation: 863166

Here is possible use aggregate by named aggregations:

df = df.groupby('name').agg(total=('marks','sum'),
                            count=('marks','size')).reset_index()
print (df)
   name  total  count
0  mark    100      3
1   tom    150      7

Or with specify column after groupby and pass tuples:

df = df.groupby('name')['marks'].agg([('total', 'sum'),
                                      ('count','size')]).reset_index()
print (df)
   name  total  count
0  mark    100      3
1   tom    150      7

Upvotes: 1

Related Questions