Reputation: 24084
I'm new to Pandas and have some time series data. How can I do the following operations easily?
I have a 2D matrix called input. Each row has 5 elements, and there are thousands of rows.
input[t,:] = [f1, f2, f3, f4, f5]
(1) I need to calculate the relative difference between samples.
i.e. rel[t,:] = ( input[t,:]-input[t-1,:] ) / input[t-1,:]
(2) I need to create a sliding window of size 80.
i.e. win[t,:] = [rel[t,:],rel[t-1,:],...,rel[t-79,:]]
How can I do this in Pandas, or in another framework such as scikit.timeseries?
Upvotes: 1
Views: 1928
Reputation: 67427
You can do both in plain numpy, although pandas probably has dedicated functionality that makes it easier. For a single 1D series:
import numpy as np
rel = np.diff(input) / input[:-1]
and
from numpy.lib.stride_tricks import as_strided
win = as_strided(rel, shape=(rel.shape[0]-79, 80), strides=rel.strides*2)
will do it.
If the input has more than one row (one series per row, with time running along the second axis), you can still do the above as:
rel = np.diff(input, axis=1) / input[:, :-1]
win = as_strided(rel, shape=(rel.shape[0], rel.shape[1]-79, 80),
                 strides=rel.strides + rel.strides[1:])
although you may want to play around with the 'shape' and matching strides
to get the exact windowed shape you are after.
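For the layout in the question (time along the first axis, 5 features per row), a minimal sketch might look like the following. It swaps as_strided for numpy.lib.stride_tricks.sliding_window_view, which assumes NumPy 1.20 or later; the function name relative_windows is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def relative_windows(x, window=80):
    """Hypothetical helper: relative differences plus sliding windows.

    Assumes x has shape (T, F) with time along axis 0.
    """
    x = np.asarray(x, dtype=float)
    # rel[t] = (x[t+1] - x[t]) / x[t], shape (T-1, F)
    rel = np.diff(x, axis=0) / x[:-1]
    # One window per valid end point, shape (T-window, F, window)
    views = sliding_window_view(rel, window, axis=0)
    # Reorder to (T-window, window, F) and put the newest sample first,
    # matching win[t] = [rel[t], rel[t-1], ..., rel[t-window+1]]
    win = views.transpose(0, 2, 1)[:, ::-1, :]
    return rel, win
```

The result is a read-only view into rel, so no data is copied; call win.copy() if you need to modify it.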
Upvotes: 2
Reputation: 128928
The docs are quite comprehensive on these types of operations; see:
1) http://pandas.pydata.org/pandas-docs/dev/timeseries.html#time-series-related-instance-methods
2) http://pandas.pydata.org/pandas-docs/dev/computation.html#expanding-window-moment-functions
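As a rough sketch of how those methods could be combined for the question's case: pct_change gives the relative difference, and shift builds the lagged copies that make up each window. The function name lagged_windows and the column names f1..f5 are assumptions for illustration, not something from the docs.

```python
import numpy as np
import pandas as pd


def lagged_windows(df, window=80):
    """Hypothetical helper: relative differences plus lagged windows.

    Assumes df has one row per time step and one column per feature.
    """
    # (df[t] - df[t-1]) / df[t-1]; the first row has no predecessor, so drop it
    rel = df.pct_change().iloc[1:]
    # Column level 0 is the lag: 0 -> rel[t], 1 -> rel[t-1], ...
    lagged = pd.concat({lag: rel.shift(lag) for lag in range(window)}, axis=1)
    # The first window-1 rows do not have a full history yet
    win = lagged.iloc[window - 1:]
    return rel, win


# Small usage example with made-up data
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(1.0, 2.0, size=(200, 5)),
                  columns=["f1", "f2", "f3", "f4", "f5"])
rel, win = lagged_windows(df, window=80)
```

Each row of win then holds 80 x 5 values, newest first, under a (lag, feature) column MultiIndex. This copies the data 80 times, which is fine for thousands of rows but worth reconsidering for much larger inputs.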
Upvotes: 2