Reputation: 31
I am very new to python. Sorry if my question is stupid..
I am trying to find the lowest value of previous 3 days in a time series.
e.g.
price low
1993-01-29 43.750000 NaN
1993-02-01 43.968700 NaN
1993-02-02 44.125000 43.750000
1993-02-03 42.375000 42.375000
1993-02-04 44.468700 42.375000
I tried .shift(), .min() etc. none of them works. your help is greatly appreciated!
Upvotes: 2
Views: 1064
Reputation: 30424
No need to do this by hand, there is already a function for it:
pd.rolling_min( df['price'], 3 )
1993-01-29 NaN
1993-02-01 NaN
1993-02-02 43.750
1993-02-03 42.375
1993-02-04 42.375
More generally there are a number of rolling-style functions to handle common cases and a rolling_apply
for user functions. Many libraries/packages have these sorts of functions which you can usually find by searching for "moving" or "rolling".
Documentaion for pandas moving/rolling functions
Upvotes: 3
Reputation: 81594
You can try this (of course you can populate the last_x_dates
list with what ever values you need):
import datetime
indexes = [datetime.datetime(2015,1,1), datetime.datetime(2015,1,2), datetime.datetime(2015,1,3), datetime.datetime(2015,1,4)]
df = pd.DataFrame(data={'price' : [12,4,124,555]},index=indexes)
>> price
>> 2015-01-01 12
>> 2015-01-02 4
>> 2015-01-03 124
>> 2015-01-04 555
last_x_dates = [datetime.datetime(2015,1,3), datetime.datetime(2015,1,4)]
df = df[df.index.isin(last_x_dates)]
print min(df['price'])
>> 124
Upvotes: 0
Reputation: 24742
You can try something like this.
import pandas as pd
import numpy as np
# your data
# ===========================
np.random.seed(0)
df = pd.DataFrame(100+np.random.randn(100).cumsum(), index=pd.date_range('2015-01-01', periods=100, freq='B'), columns=['price'])
df
price
2015-01-01 101.7641
2015-01-02 102.1642
2015-01-05 103.1429
2015-01-06 105.3838
2015-01-07 107.2514
2015-01-08 106.2741
2015-01-09 107.2242
2015-01-12 107.0729
2015-01-13 106.9696
2015-01-14 107.3802
... ...
2015-05-07 100.1852
2015-05-08 101.4077
2015-05-11 101.6160
2015-05-12 102.5926
2015-05-13 102.9490
2015-05-14 103.6555
2015-05-15 103.6660
2015-05-18 105.4519
2015-05-19 105.5788
2015-05-20 105.9808
[100 rows x 1 columns]
# processing
# ===========================
pd.concat([df, df.shift(1), df.shift(2)], axis=1).dropna().max(axis=1)
2015-01-05 103.1429
2015-01-06 105.3838
2015-01-07 107.2514
2015-01-08 107.2514
2015-01-09 107.2514
2015-01-12 107.2242
2015-01-13 107.2242
2015-01-14 107.3802
2015-01-15 107.5243
2015-01-16 108.9785
...
2015-05-07 100.5884
2015-05-08 101.4077
2015-05-11 101.6160
2015-05-12 102.5926
2015-05-13 102.9490
2015-05-14 103.6555
2015-05-15 103.6660
2015-05-18 105.4519
2015-05-19 105.5788
2015-05-20 105.9808
Freq: B, dtype: float64
Upvotes: 0