Reputation: 475
I have sample data recorded every minute, as below:
datetime value
2021-04-10 00:01:00+00:00. 0
2021-04-10 00:02:00+00:00. 0
2021-04-10 00:03:00+00:00. 0
2021-04-10 00:04:00+00:00. 1
2021-04-10 00:05:00+00:00. 0
2021-04-10 00:06:00+00:00. 1
2021-04-10 00:07:00+00:00. 0
2021-04-10 00:08:00+00:00. 1
2021-04-10 00:09:00+00:00. 1
I would like to create another column (expected) that samples the data every 3 minutes and: a) assigns 0 when at least three of the sampled values are 0; b) assigns 1 when fewer than three of the sampled values are 0.
The expected output should be like this:
datetime value expected
2021-04-10 00:03:00+00:00 [0,0,0] 0
2021-04-10 00:06:00+00:00 [1,0,1] 1
2021-04-10 00:09:00+00:00 [0,1,1] 1
Upvotes: 0
Views: 34
Reputation: 863301
First convert the values to datetimes, then use DataFrame.resample with a 3-minute frequency, aggregating value into lists and keeping the last datetime of each window. Then check whether each list contains at least one 1 with any, and convert the resulting column to integers:
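import pandas as pd

# Strip the stray trailing "." from each timestamp string, then parse to datetimes
df['datetime'] = pd.to_datetime(df['datetime'].replace(r'\.', '', regex=True))
# Right-closed 3-minute windows: collect values as lists, keep each window's last timestamp
df = (df.resample('3Min', on='datetime', closed='right')
        .agg({'value': list, 'datetime': 'last'})
        .reset_index(drop=True))
df = df[['datetime', 'value']]
# 1 if the window contains at least one 1, otherwise 0
df['expected'] = df['value'].apply(any).astype(int)
print(df)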
datetime value expected
0 2021-04-10 00:03:00+00:00 [0, 0, 0] 0
1 2021-04-10 00:06:00+00:00 [1, 0, 1] 1
2 2021-04-10 00:09:00+00:00 [0, 1, 1] 1
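Note that any only matches the rule in the question when every window holds exactly three samples; with a missing minute, a window such as [0, 0] would still get 0. If that matters, here is a minimal sketch that counts the zeros explicitly. It is not the resample approach above: it swaps in a groupby on dt.ceil('3min'), which labels each row with the end of its window, and it assumes the timestamps are already parsed as UTC datetimes. The names window_end, zero_count and result are only illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt with already-parsed UTC timestamps (assumption).
df = pd.DataFrame({
    "datetime": pd.date_range("2021-04-10 00:01:00", periods=9, freq="min", tz="UTC"),
    "value": [0, 0, 0, 1, 0, 1, 0, 1, 1],
})

# Round each timestamp up to the next 3-minute boundary, so 00:01..00:03 -> 00:03,
# 00:04..00:06 -> 00:06, and so on (right-closed windows labelled by their end).
window_end = df["datetime"].dt.ceil("3min")

# Collect the sampled values of each window into a list.
result = df.groupby(window_end)["value"].agg(list).to_frame("value")

# Count the zeros per window and apply the rule literally:
# 0 when at least three values are 0, otherwise 1.
zero_count = df["value"].eq(0).groupby(window_end).sum()
result["expected"] = (zero_count < 3).astype(int)

result = result.reset_index()
print(result)
```

On the sample data this gives the same expected values as above (0, 1, 1), since every window there is complete.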
Upvotes: 1