Arman Asryan
Arman Asryan

Reputation: 53

Spliting a dataframe into multiple 5-second dataframes and obtaining count in Python

I have a relatively big dataset that I want to split into multiple dataframes in Python based on a column containing a datetime object. The values in the column (that I want to split the dataframe by) are given in the following format:

  1. 2015-11-01 00:00:05

You may assume the dataframe looks like this.

How can I split the dataframe into 5-second intervals in the following way:

  1. 1st dataframe 2015-11-01 00:00:00 - 2015-11-01 00:00:05,

  2. 2nd dataframe 2015-11-01 00:00:05 - 2015-11-01 00:00:10, and so on.

I also need to count the number of observations in each of resulting dataframes. In other, words, it would be nice if I could get another dataframe with 2 columns (the desired output format can be found below):

enter image description here

Upvotes: 2

Views: 337

Answers (1)

jezrael
jezrael

Reputation: 862641

Create dictionary of DataFrames and add new column with assign:

rng = pd.date_range('2015-11-01 00:00:00', periods=100, freq='S')
df = pd.DataFrame({'Date': rng, 'a': range(100)})  
print (df.head(10))
                 Date  a
0 2015-11-01 00:00:00  0
1 2015-11-01 00:00:01  1
2 2015-11-01 00:00:02  2
3 2015-11-01 00:00:03  3
4 2015-11-01 00:00:04  4
5 2015-11-01 00:00:05  5
6 2015-11-01 00:00:06  6
7 2015-11-01 00:00:07  7
8 2015-11-01 00:00:08  8
9 2015-11-01 00:00:09  9

g = df.groupby(pd.Grouper(key='Date', freq='5S'))

dfs = {k.strftime('%Y-%m-%d %H:%M:%S'):v.assign(A=range(1,len(v)+1), B=len(v)) for k,v in g}

print (dfs['2015-11-01 00:00:05'])
                 Date  a  A  B
5 2015-11-01 00:00:05  5  1  5
6 2015-11-01 00:00:06  6  2  5
7 2015-11-01 00:00:07  7  3  5
8 2015-11-01 00:00:08  8  4  5
9 2015-11-01 00:00:09  9  5  5

If need count rows first aggreagte size and for Interval is add 1 to index:

df1 = df.groupby(pd.Grouper(key='Date', freq='5S')).size().reset_index(name='Count')
df1['Interval'] = df1.index + 1
print (df1.head())
                 Date  Count  Interval
0 2015-11-01 00:00:00      5         1
1 2015-11-01 00:00:05      5         2
2 2015-11-01 00:00:10      5         3
3 2015-11-01 00:00:15      5         4
4 2015-11-01 00:00:20      5         5

Upvotes: 3

Related Questions