Reputation: 4490
Sample DF:
df = pd.DataFrame(np.random.randint(1,10,size=(6,2)),columns = list("AB"))
df["A"] = ["1111","2222","1111","1111","2222","1111"]
df["B"] = ["20010101","20010101","20010101","20010101","20010201","20010201"]
df
OP:
A B
0 1111 20010101
1 2222 20010101
2 1111 20010101
3 1111 20010101
4 2222 20010201
5 1111 20010201
I am trying to find the max transactions done by the user_id in a single day.
For example, for ID: "1111" has done 3 transactions on "200010101" and 1 transaction on "20010201" so the maximum here should be 3, while the ID: 2222 has done 1 transaction on "20010101" and 1 transaction on "20010202" so the op is 1.
Expected OP:
MAX TRANS IN SINGLE DAY
1111 3
2222 1
Any pandas way to achieve this instead of creating groups and iterating through it.
Upvotes: 0
Views: 5797
Reputation: 25259
To find max you need groupby
, unstack
, max
on index
In [1832]: df.groupby(['A', 'B'])['A'].count().unstack().max(axis=1)
Out[1832]:
A
1111 3
2222 1
dtype: int64
Upvotes: 2
Reputation: 42916
We can do groupby
twice. First we get the count of each occurence in column B
of each ID in column A
. Then we groupby again and get the max value:
df2 = pd.DataFrame(df.groupby(['A', 'B'])['B'].count())\
.rename({'B':'MAX TRANS SINGLE DAY'}, axis=1)\
.reset_index()
df = df2.groupby('A', as_index=False).agg({'MAX TRANS SINGLE DAY':['max', 'min']})
print(df)
A MAX TRANS SINGLE DAY
max min
0 1111 3 1
1 2222 1 1
Upvotes: 1