Reputation: 373
I want to use an array and its first derivative (diff) as features for training. Since the diff array is of a smaller size, I would like to pad it so that I don't have problems with sizes when I stack them and use both as features.
If I pad the diff(array) with a 0, how should I align them? Do I put the 0 at the beginning of the resulting diff(array) or at the end? What is the correct way of aligning an array with its derivative? E.g. in Python:
a = [1,32,43,54]
b = np.diff(np.array(a))
np.insert(b, len(b), 0) # at the end?
np.insert(b, 0, 0) # or at the beginning?
Upvotes: 3
Views: 516
Reputation: 114841
Instead of left- or right-sided finite differences, you could use a centered finite difference (which is equivalent to taking the average of the left- and right-sided differences), and then pad both ends with appropriate approximations of the derivative there. This keeps each derivative estimate aligned with its data value, and it usually gives a better estimate of the derivative.
For example,
In [33]: y = np.array([1, 2, 3.5, 3.5, 4, 3, 2.5, 1.25])
In [34]: dy = np.empty(len(y))
In [35]: dy[1:-1] = 0.5*(y[2:] - y[:-2])
In [36]: dy[0] = y[1] - y[0]
In [37]: dy[-1] = y[-1] - y[-2]
In [38]: dy
Out[38]: array([ 1. , 1.25 , 0.75 , 0.25 , -0.25 , -0.75 , -0.875, -1.25 ])
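As an aside (my addition, not part of the original answer): as far as I know, np.gradient implements essentially this scheme by default, with central differences in the interior and one-sided differences at the two ends, so it returns an array the same length as its input:
import numpy as np

y = np.array([1, 2, 3.5, 3.5, 4, 3, 2.5, 1.25])
# This should reproduce dy from the snippet above: central differences at the
# interior points, one-sided differences at the two ends.
dy = np.gradient(y)
print(dy)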
The following script uses matplotlib to create a visualization of these derivative estimates:
import numpy as np
import matplotlib.pyplot as plt
y = np.array([1, 2, 3.5, 3.5, 4, 3, 2.5, 1.25])
dy = np.empty(len(y))
dy[1:-1] = 0.5*(y[2:] - y[:-2])
dy[0] = y[1] - y[0]
dy[-1] = y[-1] - y[-2]
plt.plot(y, 'b-o')
for k, (y0, dy0) in enumerate(zip(y, dy)):
    t = 0.25
    plt.plot([k-t, k+t], [y0 - t*dy0, y0 + t*dy0], 'c', alpha=0.4, linewidth=4)
plt.grid()
plt.show()
There are more sophisticated tools for estimating derivatives (e.g. scipy.signal.savgol_filter has an option for estimating the derivative, and if your data is periodic, you could use scipy.fftpack.diff), but a simple finite difference might work fine as your training input.
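For example, here is a minimal sketch of savgol_filter's derivative option (the window length and polynomial order below are arbitrary choices for illustration, not values taken from the answer):
import numpy as np
from scipy.signal import savgol_filter

y = np.array([1, 2, 3.5, 3.5, 4, 3, 2.5, 1.25])
# deriv=1 requests the first derivative; delta is the sample spacing.
# window_length=5 and polyorder=2 are assumed values for this sketch.
dy = savgol_filter(y, window_length=5, polyorder=2, deriv=1, delta=1.0)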
Upvotes: 2
Reputation: 466
According to the documentation, diff is simply doing out[n] = a[n+1] - a[n]. This means it is not a derivative approximated by a finite difference, but the discrete difference. To calculate the finite-difference derivative, you need to divide by the step size (unless your step size is 1, of course). Example:
import numpy as np
x = np.linspace(0,2*np.pi,30)
y = np.sin(x)
dy = np.diff(y) / np.diff(x)  # divide the discrete difference by the step size
Here, y is a function of x at specific points, and dy is its derivative. The derivative from this formula is a central derivative, meaning that its values are located between the points in x. If you need the derivatives at the same points, I would suggest calculating them from the two neighbouring points:
(y[:-2] - y[2:]) / (x[:-2] - x[2:])
This way, you could add a 0 to both ends of the derivative vector, or trim your input vectors accordingly.
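A minimal sketch of that idea (my own illustration, not code from the answer): pad the ends with 0 so the derivative lines up with x and y and can be stacked as a feature:
import numpy as np

x = np.linspace(0, 2*np.pi, 30)
y = np.sin(x)

dy = np.zeros_like(y)
# Central differences at the interior points; the first and last entries stay 0.
dy[1:-1] = (y[2:] - y[:-2]) / (x[2:] - x[:-2])

features = np.column_stack([y, dy])  # y and dy now have the same length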
Upvotes: 1