siamii

Reputation: 24084

Pandas time series operations

I'm new to Pandas and have some time series data. How can I do the following operations easily?

I have a 2D matrix called input. Each row has 5 elements, and there are thousands of rows.

input[t,:] = [f1, f2, f3, f4, f5]

(1) I need to calculate the relative difference between samples.

i.e. rel[t,:] = ( input[t,:]-input[t-1,:] ) / input[t-1,:]

(2) I need to create a sliding window of size 80.

i.e. win[t,:] = [rel[t,:],rel[t-1,:],...,rel[t-79,:]]

How could I do this in Pandas, or in any other framework such as scikit.timeseries?

Upvotes: 1

Views: 1928

Answers (2)

Jaime

Reputation: 67427

You can do both in plain numpy, although pandas probably has dedicated functionality that makes it easier. For a 1D input:

import numpy as np
rel = np.diff(input) / input[:-1]  # relative change between consecutive samples

and

from numpy.lib.stride_tricks import as_strided
win = as_strided(rel, shape=(rel.shape[0]-79, 80), strides=rel.strides*2)

will do it.
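
As a quick sanity check of the two lines above, here is a minimal sketch on a short 1D series. The array x and the window size of 4 (standing in for 80) are made up for illustration. It suggests that row i of win is simply rel[i:i + window], so the windows run oldest-first:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

window = 4  # small stand-in for the 80 used in the answer
x = np.array([1.0, 2.0, 4.0, 5.0, 10.0, 8.0, 16.0])  # hypothetical 1D series

# rel[i] = (x[i+1] - x[i]) / x[i]
rel = np.diff(x) / x[:-1]

# Overlapping windows as a read-only view; no data is copied
win = as_strided(rel,
                 shape=(rel.shape[0] - (window - 1), window),
                 strides=rel.strides * 2,
                 writeable=False)

# Each row of win matches the corresponding slice of rel
for i in range(win.shape[0]):
    assert np.array_equal(win[i], rel[i:i + window])
```

If the newest sample should come first in each window, as in the question, win[:, ::-1] reverses the order within every window.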


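If the input is 2D, with each row holding one series so that time runs along axis 1, the same approach works (for the layout in the question, where time runs along axis 0, pass input.T instead):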

rel = np.diff(input, axis=1) / input[:, :-1]
win = as_strided(rel, shape=(rel.shape[0], rel.shape[1]-79, 80),
                 strides=rel.strides + rel.strides[1:])

You may need to adjust the shape and the matching strides to get the exact windowed layout you are after.
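
For the layout in the question (rows are time steps, columns are the 5 features), one possible pandas-based sketch is shown below. It is not part of the approach above: it swaps in DataFrame.pct_change for the relative difference and numpy's sliding_window_view (NumPy 1.20 or later) in place of as_strided. The data, the column names, and the window size of 3 (standing in for 80) are assumptions made for illustration:

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

window = 3  # small stand-in for 80

# Hypothetical data: 6 time steps (rows) x 5 features (columns)
data = np.arange(1.0, 31.0).reshape(6, 5)
df = pd.DataFrame(data, columns=["f1", "f2", "f3", "f4", "f5"])

# (x[t] - x[t-1]) / x[t-1] per column; the first row is NaN and is dropped
rel = df.pct_change().dropna().to_numpy()

# Windows along the time axis; the window axis comes last: (n_windows, 5, window)
win = sliding_window_view(rel, window_shape=window, axis=0)

# Reorder to (n_windows, window, 5) and reverse so the newest sample is first
win = win.transpose(0, 2, 1)[:, ::-1, :]
```

Here win[k] holds rel[k + window - 1], rel[k + window - 2], ..., rel[k], which matches the newest-first ordering asked for in the question.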

Upvotes: 2
