Reputation: 27
I have a table whose first column holds integers (7, 8, 17, 467, etc.) giving the time in seconds, and whose second column holds the number of packets delivered in that second. I would like to sum the packets over consecutive 10-second windows, so that I get one packet count per 10 seconds and a clearer plot. One complication is that not every second has packets: for example, if no packets arrived at second 5, the row with Time = 5 does not exist.
Does anyone have any suggestions?
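import pandas as pd
import matplotlib.pyplot as plt
# `data` is the full packet DataFrame; keep only the DIO messages (label 0)
# .copy() avoids a SettingWithCopyWarning when 'Time' is reassigned below
rpl_dio = data.loc[data['MessageLabel'] == 0].copy()
rpl_dio['Time'] = rpl_dio['Time'].astype(int)
# count the packets seen in each whole second
rpl_dio_total = rpl_dio.groupby('Time')['MessageLabel'].count().reset_index(name='PackTime')
rpl_dio_total = rpl_dio_total.sort_values(by='Time', ascending=True)
plt.figure(figsize=(15, 9))
plt.plot(rpl_dio_total['Time'], rpl_dio_total['PackTime'])
plt.title("DIO packets rate")
plt.ylabel("Number of packets")
plt.xlabel("Time [s]")
plt.show()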
Upvotes: 0
Views: 83
Reputation: 984
I would first add a new column holding a Timestamp (substitute your own start date) plus a timedelta built from the seconds:
df['Seconds'] = pd.Timestamp('2019/01/01 00:00:00') + pd.to_timedelta(df['Time'], unit='s')
Out[61]:
Time PackTime Seconds
0 7 32 2019-01-01 00:00:07
1 9 53 2019-01-01 00:00:09
2 10 34 2019-01-01 00:00:10
3 11 53 2019-01-01 00:00:11
4 12 34 2019-01-01 00:00:12
Then set the 'Seconds' column as your index:
df.set_index('Seconds', inplace=True)
Out[62]:
Time PackTime
Seconds
2019-01-01 00:00:07 7 32
2019-01-01 00:00:09 9 53
2019-01-01 00:00:10 10 34
2019-01-01 00:00:11 11 53
2019-01-01 00:00:12 12 34
Now you can use the resample() method, where '10S' means 10 seconds (pandas 2.2 and later prefer the lowercase alias '10s'):
df['PackTime'].resample('10S').sum()
Out[63]:
Seconds
2019-01-01 00:00:00 85
2019-01-01 00:00:10 121
Freq: 10S, Name: PackTime, dtype: int64
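For reference, the steps above can be put together into one runnable sketch. It reuses the five sample rows shown in this answer. The variable names `base`, `per_10s` and `per_10s_int` are only illustrative, and the window is passed as a `pd.Timedelta` rather than a string alias so it should not depend on which spelling of the seconds alias your pandas version expects. The last line is an alternative that was not part of the answer: integer division on the raw seconds, which skips the timestamp conversion but does not emit rows for empty windows.

```python
import pandas as pd

# Sample rows copied from the answer above
df = pd.DataFrame({
    "Time": [7, 9, 10, 11, 12],
    "PackTime": [32, 53, 34, 53, 34],
})

# Arbitrary start date (an assumption; use your capture's real start time)
base = pd.Timestamp("2019-01-01 00:00:00")
df["Seconds"] = base + pd.to_timedelta(df["Time"], unit="s")

# Sum PackTime over 10-second windows; windows with no rows sum to 0
per_10s = (
    df.set_index("Seconds")["PackTime"]
      .resample(pd.Timedelta(seconds=10))
      .sum()
)
print(per_10s)

# Alternative without timestamps: label each row by the start of its window
per_10s_int = df.groupby(df["Time"] // 10 * 10)["PackTime"].sum()
print(per_10s_int)
```

With `resample`, seconds that have no row simply contribute nothing, and a whole window without rows still appears with a total of 0, which keeps the x-axis of a plot evenly spaced.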
Upvotes: 2
Reputation: 2032
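Try the following, which bins Time into 10-second intervals and sums the packets in each bin (raise the upper bound of 100 if your Time values go beyond it):
df.groupby(pd.cut(df['Time'], bins=np.arange(0, 100, 10), right=False), observed=False)['PackTime'].sum()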
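A minimal runnable sketch of the `pd.cut` binning idea, assuming the packet counts live in a `PackTime` column as in the other answer. The first five rows are that answer's sample data; the last row (Time 35) is a made-up value added only to show how an empty window is handled. The names `width`, `edges` and `per_bin` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Time": [7, 9, 10, 11, 12, 35],      # 35 is a hypothetical extra row
    "PackTime": [32, 53, 34, 53, 34, 5],
})

width = 10  # window length in seconds
# Bin edges 0, 10, 20, ... extended far enough to cover the largest Time
edges = np.arange(0, df["Time"].max() + width + 1, width)

# right=False gives half-open bins [0, 10), [10, 20), ... so second 10
# falls in the second window
bins = pd.cut(df["Time"], bins=edges, right=False)

# observed=False keeps windows that contain no rows (their sum is 0)
per_bin = df.groupby(bins, observed=False)["PackTime"].sum()
print(per_bin)
```

Grouping the DataFrame by the binned column and summing `PackTime` returns total packets per window; calling `count()` instead would return the number of rows in each window, not the number of packets.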
Upvotes: 0