Reputation: 33
While there are questions on stackoverflow that somewhat specify the same queries i'm looking to make on a pandas dataframe, none have been able to identify MULTIPLE rows that share the same values.
To explain my issue, I have a dataframe with info on the people/time slot for when they decide to use the gym. It looks like this,
,User,Time,Date
0, User 1 ,12:00PM ,10/5/20 (Identical 3 times)
1, User 2 ,12:00PM ,10/5/20 (Identical 3 times)
2, User 3 ,12:00PM ,10/5/20
3, User 1 ,1:00PM ,10/4/20 (Identical 2 times)
4, User 2 ,1:00PM ,10/4/20 (Identical 2 times)
5, User 5 ,1:00PM ,10/4/20
6, User 6 ,1:00PM ,10/4/20
7, User 7 ,12:00PM ,10/4/20
9, User 1 ,11:00AM ,10/4/20 (Identical 1 time)
10, User 2 ,11:00AM ,10/4/20 (Identical 1 time)
11, User 3 ,10:00AM ,10/4/20
12, User 6 ,10:00AM ,10/4/20
13, User 7 ,10:00AM ,10/4/20
My goal is to create a dataframe that would group the "Names" column by the "Time" and "date" for every row in the frame, which would create a count of Users that share the same signup time/date as another User. By doing that on the dataset above, it should look somewhat like-
,User, User, Count of identical gym times
0, User 1 , User 2, 3
1, User 3 , User 1, 1
2, User 3 , User 2, 1
3, User 1 , User 5, 1
4, User 2 , User 5, 1
5, User 2 , User 6, 1
6, User 3 , User 6, 1
7, User 3 , User 7, 1
8, User 4 , User 6, 1
9, User 4 , User 7, 1
I followed a few guides that tried to count the amount of times that rows are similar,
df.groupby('Date').User.nunique()
would return,
Date
2020-08-20 6
2020-08-21 13
2020-08-22 15
2020-08-23 18
2020-08-24 25
2020-08-25 24
2020-08-26 24
2020-08-27 24
2020-08-28 20
2020-08-29 12
2020-08-30 8
This doesnt help. Are there any other guides for this "query"?
Upvotes: 0
Views: 195
Reputation: 2583
Does this output helps?
df.groupby(['Time','Date'],as_index=False).agg({'User':lambda x: [item for item in x]})
Time Date User
0 10:00AM 10/4/20 [User3, User6, User7]
1 11:00AM 10/4/20 [User1, User2]
2 12:00PM 10/4/20 [User7]
3 12:00PM 10/5/20 [User1, User2, User3]
4 1:00PM 10/4/20 [User1, User2, User5, User6]
Upvotes: 0