Reputation: 5527
I have a pandas DataFrame with a datetime index:
Date
2013-02-22 00:00:00+00:00 0.280001
2013-02-25 00:00:00+00:00 0.109999
2013-02-26 00:00:00+00:00 -0.150000
2013-02-27 00:00:00+00:00 0.130001
2013-02-28 00:00:00+00:00 0.139999
Name: MOM12
and want to evaluate the previous three values of the given datetime index.
date = "2013-02-27 00:00:00+00:00"
df.ix[date]
I searched for this, but since my index is a date I can't do:
df.ix[int-1]
Upvotes: 30
Views: 30764
Reputation: 31
I had the same problem, and thanks to Andy Hayden's solution I got it working for iterating over the rows of a DataFrame with a DatetimeIndex, so I wrapped it in a small function. It can be used to get previous or future values, as long as the resulting position doesn't go out of bounds.
def get_row(df, row, n=0, value=None):
    # row is a tuple from df.itertuples(); row[0] holds the index label
    loc = df.index.get_loc(row[0])
    if value is None:
        return df.iloc[loc + n]
    else:
        return df.iloc[loc + n][value]
So while iterating over the rows, you can call this function.
for row in df.itertuples():
    # Get the whole previous row
    get_row(df, row, -1)
    # Get the previous row's value for a certain column
    get_row(df, row, -1, "column_name")
    # Get the next row's value for a certain column
    get_row(df, row, 1, "column_name")
    # The current row's value can be fetched the same way, but this is slower
    get_row(df, row, 0, "column_name")
    # Faster: read it straight from the tuple (+1 skips the index at row[0])
    row[df.columns.get_loc("column_name") + 1]
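One caveat with the "out of bounds" condition: with iloc, a negative position does not raise an error but counts from the end, so asking for the row before the first one silently returns the last row. Below is a minimal sketch of a bounds-checked variant. The name get_row_checked, its parameters, and the sample data are hypothetical, and it takes the index label directly rather than an itertuples row.

```python
import pandas as pd

def get_row_checked(df, label, n=0, column=None):
    """Return the row (or one column's value) n positions away from `label`.

    Returns None when the target position falls outside the frame.
    Assumes the index is unique, so get_loc returns a single integer.
    """
    pos = df.index.get_loc(label) + n
    if pos < 0 or pos >= len(df):
        return None  # avoid iloc's negative-index wraparound and IndexError
    row = df.iloc[pos]
    return row if column is None else row[column]

# Hypothetical sample data for illustration
idx = pd.to_datetime(["2013-02-22", "2013-02-25", "2013-02-26"])
df = pd.DataFrame({"value": [0.280001, 0.109999, -0.150000]}, index=idx)

for label in df.index:
    # Previous value for each row; None for the first row
    print(label.date(), get_row_checked(df, label, -1, "value"))
```

Returning None is just one possible choice here; raising an exception or returning NaN may suit other code better.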
Upvotes: 1
Reputation: 4233
Use shift to get the values from previous rows:
import pandas as pd

data = [
    ('2013-02-22 00:00:00+00:00', 0.280001),
    ('2013-02-25 00:00:00+00:00', 0.109999),
    ('2013-02-26 00:00:00+00:00', -0.150000),
    ('2013-02-27 00:00:00+00:00', 0.130001),
    ('2013-02-28 00:00:00+00:00', 0.139999),
]
df = pd.DataFrame(data=data, columns=['date', 'value'])
df['date'] = pd.to_datetime(df['date'])
# shift(k) moves the values down k rows, so each row sees the value from k rows earlier
df['p_value'] = df.value.shift(1)
df['pp_value'] = df.value.shift(2)
df['ppp_value'] = df.value.shift(3)
print(df)
Output:
date value p_value pp_value ppp_value
0 2013-02-22 00:00:00+00:00 0.280001 NaN NaN NaN
1 2013-02-25 00:00:00+00:00 0.109999 0.280001 NaN NaN
2 2013-02-26 00:00:00+00:00 -0.150000 0.109999 0.280001 NaN
3 2013-02-27 00:00:00+00:00 0.130001 -0.150000 0.109999 0.280001
4 2013-02-28 00:00:00+00:00 0.139999 0.130001 -0.150000 0.109999
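To connect this back to the question (looking up the previous three values for one given date), here is a minimal sketch that keeps the dates as the index and then selects a single row with loc. The lag_1 to lag_3 column names are hypothetical. Note that shift without a freq argument moves values by row position, not by calendar days, so the weekend gap in the data does not matter.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2013-02-22", "2013-02-25", "2013-02-26", "2013-02-27", "2013-02-28"],
    utc=True,
)
df = pd.DataFrame(
    {"value": [0.280001, 0.109999, -0.150000, 0.130001, 0.139999]},
    index=idx,
)

# Build one lagged column per step back (column names are illustrative)
for k in range(1, 4):
    df[f"lag_{k}"] = df["value"].shift(k)

# Select the row for the date of interest and read off its three lags,
# ordered from most recent to oldest
row = df.loc[pd.Timestamp("2013-02-27", tz="UTC")]
previous_three = row[["lag_1", "lag_2", "lag_3"]].tolist()
print(previous_three)
```

This computes lags for every row, which is convenient when the previous values are needed for many dates; for a single lookup, slicing by position is likely cheaper.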
Upvotes: 0
Reputation: 375377
Here's one way to do it. First grab the integer location of the index key via get_loc:
In [15]: t = pd.Timestamp("2013-02-27 00:00:00+00:00")
In [16]: df1.index.get_loc(t)
Out[16]: 3
And then you can use iloc (to get the row at an integer location, or to slice by integer location):
In [17]: loc = df1.index.get_loc(t)
In [18]: df.iloc[loc - 1]
Out[18]:
Date 2013-02-26 00:00:00
-0.15
Name: 2, Dtype: object
In [19]: df1.iloc[slice(max(0, loc-3), min(loc, len(df)))]
# the min and max feel slightly hacky (!) but are needed in case it's within the top or bottom 3 rows
Out[19]:
Date
2013-02-22 0.280001
2013-02-25 0.109999
2013-02-26 -0.150000
See the indexing section of the docs.
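As a self-contained recap of the get_loc plus iloc approach, here is a minimal sketch that wraps the slice in a helper. The function name previous_values is hypothetical, the data is rebuilt from the question with a UTC-aware index, and the helper assumes the index is unique (otherwise get_loc may return a slice or a boolean mask instead of an integer).

```python
import pandas as pd

def previous_values(series, key, n=3):
    """Return up to n values immediately before `key` (key itself excluded)."""
    loc = series.index.get_loc(key)  # integer position of the label
    start = max(0, loc - n)          # clamp so a negative start doesn't wrap around
    return series.iloc[start:loc]

idx = pd.to_datetime(
    ["2013-02-22", "2013-02-25", "2013-02-26", "2013-02-27", "2013-02-28"],
    utc=True,
)
s = pd.Series(
    [0.280001, 0.109999, -0.150000, 0.130001, 0.139999],
    index=idx,
    name="MOM12",
)

prev = previous_values(s, pd.Timestamp("2013-02-27", tz="UTC"))
print(prev)
```

Because the slice ends at loc (exclusive), only the lower bound needs clamping; near the top of the series the helper simply returns fewer than n values.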
I'm not quite sure how you set up your DataFrame, but that doesn't look like a DatetimeIndex to me. Here's how I got the DataFrame (with a Timestamp index):
In [11]: df = pd.read_clipboard(sep=r'\s\s+', header=None, parse_dates=[0], names=['Date', None])
In [12]: df
Out[12]:
Date
0 2013-02-22 00:00:00 0.280001
1 2013-02-25 00:00:00 0.109999
2 2013-02-26 00:00:00 -0.150000
3 2013-02-27 00:00:00 0.130001
4 2013-02-28 00:00:00 0.139999
In [13]: df1 = df.set_index('Date')
In [14]: df1
Out[14]:
Date
2013-02-22 0.280001
2013-02-25 0.109999
2013-02-26 -0.150000
2013-02-27 0.130001
2013-02-28 0.139999
Upvotes: 28