user4640449
user4640449

Reputation: 609

Python - pandas ==> WINDOW aggregate

I'm looking for the pandas equivalent of the SQL window aggregate function OVER()

clients = np.array(['c1', 'c2'])
df = pd.DataFrame({'client' : clients[np.random.randint(0,2,10)],\
'day_w' : np.random.randint(1,8,25) })

df.groupby(['client','day_w']).size()

I would like to add the following information to the DataFrame, for each row : Number of disctinct days per client.

Thanks in advance for your help

Upvotes: 1

Views: 423

Answers (1)

EdChum
EdChum

Reputation: 394169

I don't think you need to groupby the week day also, just call transform and pass func nunique which will return the number of distinct days:

OK I think I know what you're after now:

In [98]:

df['days per client'] = df.groupby(['client'])['day_w'].transform(pd.Series.nunique)
df
Out[98]:
   client  day_w  days per client
0      c2      3                6
1      c1      2                7
2      c1      5                7
3      c1      5                7
4      c2      4                6
5      c2      3                6
6      c2      7                6
7      c1      3                7
8      c2      2                6
9      c1      2                7
10     c1      6                7
11     c1      6                7
12     c1      4                7
13     c2      5                6
14     c1      6                7
15     c2      5                6
16     c2      7                6
17     c2      7                6
18     c1      2                7
19     c1      1                7
20     c2      2                6
21     c2      6                6
22     c1      7                7
23     c1      3                7
24     c1      4                7

Upvotes: 2

Related Questions