Reputation: 51
I am struggling with the correct normalization of the power spectral density (and its inverse).
I am given a real problem, let's say the readings of an accelerometer in the form of the power spectral density (psd) in units of Amplitude^2/Hz. I would like to translate this back into a randomized time series. However, first I want to understand the "forward" direction, time series to PSD.
According to [1], the PSD of a time series x(t) can be calculated by:
PSD(w) = 1/T * abs(F(w))^2 = df * abs(F(w))^2
in which T is the sampling time of x(t) and F(w) is the Fourier transform of x(t) and df=1/T is the frequency resolution in the Fourier space. However, the results I am getting are not equal to what I am getting using the scipy Welch method, see code below.
This first block of code is taken from the scipy.welch documentary:
from scipy import signal
import matplotlib.pyplot as plt
fs = 10e3
N = 1e5
amp = 2*np.sqrt(2)
freq = 1234.0
noise_power = 0.001 * fs / 2
time = np.arange(N) / fs
x = amp*np.sin(2*np.pi*freq*time)
x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
f, Pxx_den = signal.welch(x, fs, nperseg=1024)
plt.semilogy(f, Pxx_den)
plt.ylim(\[0.5e-3, 1\])
plt.xlabel('frequency \[Hz\]')
plt.ylabel('PSD \[V**2/Hz\]')
plt.show()
First thing I noticed is that the plotted psd changes with the variable fs which seems strange to me. (Maybe I need to adjust the nperseg argument then accordingly? Why is nperseg not set to fs automatically then?)
My code would be the following: (Note that I defined my own fft_full function which already takes care of the correct fourier transform normalization, which I verified by checking Parsevals theorem).
import scipy.fftpack as fftpack
def fft_full(xt,yt):
dt = xt[1] - xt[0]
x_fft=fftpack.fftfreq(xt.size,dt)
y_fft=fftpack.fft(yt)*dt
return (x_fft,y_fft)
xf,yf=fft_full(time,x)
df=xf[1] - xf[0]
psd=np.abs(yf)**2 *df
plt.figure()
plt.semilogy(xf, psd)
#plt.ylim([0.5e-3, 1])
plt.xlim(0,)
plt.xlabel('frequency [Hz]')
plt.ylabel('PSD [V**2/Hz]')
plt.show()
Unfortunately, I am not yet allowed to post images but the two plots do not look the same!
I would greatly appreciate if someone could explain to me where I went wrong and settle this once and for all :)
[1]: Eq. 2.82. Random Vibrations in Spacecraft Structures Design Theory and Applications, Authors: Wijker, J. Jaap, 2009
Upvotes: 5
Views: 3600
Reputation: 31
The scipy library uses the Welch's method to estimate a PSD. This method is more complex than just taking the squared modulus of the discrete Fourier transform. In short terms, it proceeds as follows:
Let x be the input discrete signal that contains N samples.
Split x into M overlapping segments, such that each segment sm contains nperseg samples and that each two consecutive segments overlap in noverlap samples, so that nperseg = K * (nperseg - noverlap), where K is an integer (usually K = 2). Note also that: N = nperseg + (M - 1) * (nperseg - noverlap) = (M + K - 1) * nperseg / K
From each segment sm, subtract its mean (this removes the DC component): tm = sm - sum(sm) / nperseg
Multiply the elements of the obtained zero-mean segments tm by the elements of a suitable (nonsymmetric) window function, h (such as the Hann window): um = tm * h
Calculate the Fast Fourier Transform of all vectors um. Before performing these transformations, we usually first append so many zeros to each vector um that its new dimension becomes a power of 2 (the nfft argument of the function welch is used for this purpose). Let us suppose that len(um) = 2p. In most cases, our input vectors are real-valued, so it is best to apply FFT for real data. Its results are then complex-valued vectors vm = rfft(um), such that len(vm) = 2p - 1 + 1.
Calculate the squared modulus of all transformed vectors: am = abs(vm) ** 2, or more efficiently: am = vm.real ** 2 + vm.imag ** 2
Normalize the vectors am as follows: bm = am / sum(h * h) bm[1:-1] *= 2 (this takes into account the negative frequencies), where h is a real vector of the dimension nperseg that contains the window coefficients. In case of the Hann window, we can prove that sum(h * h) = 3 / 8 * len(h) = 3 / 8 * nperseg
Estimate the PSD as the mean of all vectors bm: psd = sum(bm) / M The result is a vector of the dimension len(psd) = 2p - 1 + 1. If we wish that the sum of all psd coefficients matches the mean squared amplitude of the windowed input data (rather than the sum of squared amplitudes), then the vector psd must also be divided by nperseg. However, the scipy routine omits this step. In any case, we usually present psd on the decibel scale, so that the final result is: psd_dB = 10 * log10(psd).
For a more detailed description, please read the original Welch's paper. See also Wikipedia's page and chapter 13.4 of Numerical Recipes in C
Upvotes: 2