Reputation: 21542
I'm working on a DataFrame df:
Datetime,User
2013-12-04 08:00:01,111
2013-12-04 09:00:02,111
2013-12-04 10:00:03,111
2013-12-04 09:00:04,112
2013-12-04 10:00:05,112
2013-12-04 11:00:06,112
2013-12-04 11:00:07,113
2013-12-04 11:00:08,113
2013-12-04 11:00:09,113
2013-12-04 13:00:10,114
2013-12-04 13:00:11,113
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
with User - Datetime information. I would like to drop Users that meet certain Datetime criteria, for instance when they appear, let's say, 3 or more times in the same minute of the same hour of the same day. Under this condition, Users 113 and 115 should be dropped from the DataFrame. So far I have tried to group by the User column and to extract information from the datetime object, but with no results.
Upvotes: 0
Views: 843
Reputation: 3928
There is probably a nicer way to do this, but this is how I would do it:
import pandas as pd

# First set up the dataframe
Datetime = ['2013-12-04 08:00:01',
            '2013-12-04 09:00:02',
            '2013-12-04 10:00:03',
            '2013-12-04 09:00:04',
            '2013-12-04 10:00:05',
            '2013-12-04 11:00:06',
            '2013-12-04 11:00:07',
            '2013-12-04 11:00:08',
            '2013-12-04 11:00:09',
            '2013-12-04 13:00:10',
            '2013-12-04 13:00:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11',
            '2013-12-04 12:01:11']
user = [111, 111, 111, 112, 112, 112, 112, 113, 113, 113, 114, 113,
        115, 115, 115, 115, 115, 115]

# Parse the strings into a DatetimeIndex
Datetime = pd.to_datetime(Datetime)
df = pd.DataFrame(data={'user': user}, index=Datetime)

# Helper columns used for grouping and counting
df['count_user'] = 1
df['hour'] = df.index.hour
df['min'] = df.index.minute
df['time'] = df.index

# Sum the counter per (hour, min, user, time) group. Because 'time' is one
# of the keys, rows are counted together only when their timestamps are identical.
df = df.groupby(['hour', 'min', 'user', 'time']).sum()

# Keep the groups that occur fewer than 3 times
df = df[df.count_user < 3]

# Restore 'time' as the index and drop the helper columns
df = df.reset_index().set_index('time')
df = df.drop(columns=['count_user', 'hour', 'min'])
print(df)
                     user
time
2013-12-04 08:00:01   111
2013-12-04 09:00:02   111
2013-12-04 09:00:04   112
2013-12-04 10:00:03   111
2013-12-04 10:00:05   112
2013-12-04 11:00:06   112
2013-12-04 11:00:07   112
2013-12-04 11:00:08   113
2013-12-04 11:00:09   113
2013-12-04 12:01:11   113
2013-12-04 13:00:10   113
2013-12-04 13:00:11   114
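Note that the code above groups on the full timestamp and uses a user list that differs slightly from the question's data (11:00:07 is assigned to 112), which is why 113 survives in the output. If the goal is to drop a user entirely once they appear 3 or more times within the same minute, a variant along the following lines might work. This is only a sketch using the question's data; the names `THRESHOLD`, `work`, `hits`, `bad_users` and `result` are made up for illustration.

```python
import io
import pandas as pd

# Data copied from the question
csv_text = """Datetime,User
2013-12-04 08:00:01,111
2013-12-04 09:00:02,111
2013-12-04 10:00:03,111
2013-12-04 09:00:04,112
2013-12-04 10:00:05,112
2013-12-04 11:00:06,112
2013-12-04 11:00:07,113
2013-12-04 11:00:08,113
2013-12-04 11:00:09,113
2013-12-04 13:00:10,114
2013-12-04 13:00:11,113
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
2013-12-04 12:01:11,115
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Datetime"])

THRESHOLD = 3  # assumed cutoff: "3 or more times in the same minute"

# Truncate each timestamp to the minute; the result still carries day and hour
work = df.assign(minute=df["Datetime"].dt.floor("min"))

# For every row, the number of rows sharing its (User, minute) pair
hits = work.groupby(["User", "minute"])["Datetime"].transform("count")

# Users that reach the threshold in at least one minute
bad_users = work.loc[hits >= THRESHOLD, "User"].unique()

# Drop every row belonging to those users, not just the offending minute
result = df[~df["User"].isin(bad_users)].reset_index(drop=True)
print(result)
```

With the question's data this flags users 113 and 115 and leaves the rows for 111, 112 and 114. Flooring to the minute rather than grouping on separate hour and minute columns also keeps the day in the key, so the same minute on different days is not merged.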
Upvotes: 2