Pandas - Combining resample value cont

Question

I looking to resample my data per minute and find out how many pass fail my data has per minute.

Currently this is what the data looks like:

timeStamp Results    
1589443200000 Pass       
1589443201000 Fail       
1589443202000 Pass       
1589443203000 Pass       
1589443204000 Pass       
1589443321000 Pass       
1589443325000 Fail

What im after is something like the following:

time       Result     Count      
8:01:00    Pass        4
8:01:00    Fail        1
8:02:00    Pass        1
8:02:00    Fail        1

jezrael · Accepted Answer

First convert unix times to datetimes by to_datetime and then aggregate counts by GroupBy.size with Grouper:

df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')

df1 = (df.groupby([pd.Grouper(key='timeStamp', freq='Min'), 'Results'])
        .size()
        .reset_index(name='Count'))
print (df1)
            timeStamp Results  Count
0 2020-05-14 08:00:00    Fail      1
1 2020-05-14 08:00:00    Pass      4
2 2020-05-14 08:02:00    Fail      1
3 2020-05-14 08:02:00    Pass      1

Or if want Series.dt.floor and times by Series.dt.time:

df['timeStamp'] = pd.to_datetime(df['timeStamp'], unit='ms')

df1 = (df.groupby([df['timeStamp'].dt.floor('Min').dt.time, 'Results'])
        .size()
       .reset_index(name='Count'))
print (df1)
  timeStamp Results  Count
0  08:00:00    Fail      1
1  08:00:00    Pass      4
2  08:02:00    Fail      1
3  08:02:00    Pass      1

Pandas - Combining resample value cont

Answers (1)

Related Questions