Reputation: 1946
I'm struggling to understand a problem with numerical integration of a signal. Basically I have a signal which I would like to integrate or perform and antiderivative as a function of time (integration of pick-up coil for getting magnetic field). I've tried two different methods which in principle should be consistent but they are not. The code I'm using is the following. Beware that the signals y in the code has been previously high pass filtered using butterworth filtering (similar to what done here http://wiki.scipy.org/Cookbook/ButterworthBandpass). The signal and time basis can be downloaded here (https://www.dropbox.com/s/fi5z38sae6j5410/trial.npz?dl=0)
import scipy as sp
from scipy import integrate
from scipy import fftpack
data = np.load('trial.npz')
y = data['arr_1'] # this is the signal
t = data['arr_0']
# integration using pfft
bI = sp.fftpack.diff(y-y.mean(),order=-1)
bI2= sp.integrate.cumtrapz(y-y.mean(),x=t)
Now the two signals (besides the eventual different linear trend which can be taken out) are different, or better dynamically they are quite similar with the same time of oscillations but there is a factor approximately of 30 between the two signals, in the sense that bI2 is 30 times (approximately) lower than bI. BTW I've subtracted the mean in both the two signals to be sure that they are zero mean signals and performing integration in IDL (both with equivalent cumsumtrapz and in the fourier domain) gives values compatible with bI2. Any clue is really welcomed
Upvotes: 3
Views: 3591
Reputation: 1521
It's difficult to know what scipy.fftpack.diff()
is doing under the bonnet.
To try and solve your problem, I have dug up an old frequency domain integration function that I wrote a while ago. It's worth pointing out that in practice, one generally wants a bit more control of some of the parameters than scipy.fftpack.diff()
gives you. For example, the f_lo
and f_hi
parameters of my intf()
function allow you to band-limit the input to exclude very low or very high frequencies which may be noisy. Noisy low frequencies in particular can 'blow-up' during integration and overwhelm the signal. You may also want to use a window at the start and end of the time series to stop spectral leakage.
I have calculated bI2
and also a result, bI3
, integrated once with intf()
using the following code (I assumed an average sampling rate for simplicity):
import intf
from scipy import integrate
data = np.load(path)
y = data['arr_1']
t = data['arr_0']
bI2= sp.integrate.cumtrapz(y-y.mean(),x=t)
bI3 = intf.intf(y-y.mean(), fs=500458, f_lo=1, winlen=1e-2, times=1)
I plotted bI2 and bI3:
The two time series are of the same order of magnitude, and broadly the same shape, notwithstanding the piecewise linear trend apparent in bI2. I know this doesn't explain what's going on in the scipy function, but at least this shows it's not a problem with the frequency domain method.
The code for intf
is pasted in full below.
def intf(a, fs, f_lo=0.0, f_hi=1.0e12, times=1, winlen=1, unwin=False):
"""
Numerically integrate a time series in the frequency domain.
This function integrates a time series in the frequency domain using
'Omega Arithmetic', over a defined frequency band.
Parameters
----------
a : array_like
Input time series.
fs : int
Sampling rate (Hz) of the input time series.
f_lo : float, optional
Lower frequency bound over which integration takes place.
Defaults to 0 Hz.
f_hi : float, optional
Upper frequency bound over which integration takes place.
Defaults to the Nyquist frequency ( = fs / 2).
times : int, optional
Number of times to integrate input time series a. Can be either
0, 1 or 2. If 0 is used, function effectively applies a 'brick wall'
frequency domain filter to a.
Defaults to 1.
winlen : int, optional
Number of seconds at the beginning and end of a file to apply half a
Hanning window to. Limited to half the record length.
Defaults to 1 second.
unwin : Boolean, optional
Whether or not to remove the window applied to the input time series
from the output time series.
Returns
-------
out : complex ndarray
The zero-, single- or double-integrated acceleration time series.
Versions
----------
1.1 First development version.
Uses rfft to avoid complex return values.
Checks for even length time series; if not, end-pad with single zero.
1.2 Zero-means time series to avoid spurious errors when applying Hanning
window.
"""
a = a - a.mean() # Convert time series to zero-mean
if np.mod(a.size,2) != 0: # Check for even length time series
odd = True
a = np.append(a, 0) # If not, append zero to array
else:
odd = False
f_hi = min(fs/2, f_hi) # Upper frequency limited to Nyquist
winlen = min(a.size/2, winlen) # Limit window to half record length
ni = a.size # No. of points in data (int)
nf = float(ni) # No. of points in data (float)
fs = float(fs) # Sampling rate (Hz)
df = fs/nf # Frequency increment in FFT
stf_i = int(f_lo/df) # Index of lower frequency bound
enf_i = int(f_hi/df) # Index of upper frequency bound
window = np.ones(ni) # Create window function
es = int(winlen*fs) # No. of samples to window from ends
edge_win = np.hanning(es) # Hanning window edge
window[:es/2] = edge_win[:es/2]
window[-es/2:] = edge_win[-es/2:]
a_w = a*window
FFTspec_a = np.fft.rfft(a_w) # Calculate complex FFT of input
FFTfreq = np.fft.fftfreq(ni, d=1/fs)[:ni/2+1]
w = (2*np.pi*FFTfreq) # Omega
iw = (0+1j)*w # i*Omega
mask = np.zeros(ni/2+1) # Half-length mask for +ve freqs
mask[stf_i:enf_i] = 1.0 # Mask = 1 for desired +ve freqs
if times == 2: # Double integration
FFTspec = -FFTspec_a*w / (w+EPS)**3
elif times == 1: # Single integration
FFTspec = FFTspec_a*iw / (iw+EPS)**2
elif times == 0: # No integration
FFTspec = FFTspec_a
else:
print 'Error'
FFTspec *= mask # Select frequencies to use
out_w = np.fft.irfft(FFTspec) # Return to time domain
if unwin == True:
out = out_w*window/(window+EPS)**2 # Remove window from time series
else:
out = out_w
if odd == True: # Check for even length time series
return out[:-1] # If not, remove last entry
else:
return out
Upvotes: 5