Reputation: 125
I looking to resample my data per minute and find out how many pass fail my data has per minute.
Currently this is what the data looks like:
timeStamp Results
1589443200000 Pass
1589443201000 Fail
1589443202000 Pass
1589443203000 Pass
1589443204000 Pass
1589443321000 Pass
1589443325000 Fail
What im after is something like the following:
time Result Count
8:01:00 Pass 4
8:01:00 Fail 1
8:02:00 Pass 1
8:02:00 Fail 1
Upvotes: 1
Views: 38
Reputation: 862801
First convert unix times to datetimes by to_datetime
and then aggregate counts by GroupBy.size
with Grouper
:
df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')
df1 = (df.groupby([pd.Grouper(key='timeStamp', freq='Min'), 'Results'])
.size()
.reset_index(name='Count'))
print (df1)
timeStamp Results Count
0 2020-05-14 08:00:00 Fail 1
1 2020-05-14 08:00:00 Pass 4
2 2020-05-14 08:02:00 Fail 1
3 2020-05-14 08:02:00 Pass 1
Or if want Series.dt.floor
and times by Series.dt.time
:
df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')
df1 = (df.groupby([df['timeStamp'].dt.floor('Min').dt.time, 'Results'])
.size()
.reset_index(name='Count'))
print (df1)
timeStamp Results Count
0 08:00:00 Fail 1
1 08:00:00 Pass 4
2 08:02:00 Fail 1
3 08:02:00 Pass 1
Upvotes: 2