jay
jay

Reputation: 77

Fourier Transform Time Series in Python

I've got a time series of sunspot numbers, where the mean number of sunspots is counted per month, and I'm trying to use a Fourier Transform to convert from the time domain to the frequency domain. The data used is from https://wwwbis.sidc.be/silso/infosnmtot. The first thing I'm confused about is how to express the sampling frequency as once per month. Do I need to convert it to seconds, eg. 1/(seconds in 30 days)? Here's what I've got so far:

fs = 1/2592000
#the sampling frequency is 1/(seconds in a month)

fourier = np.fft.fft(sn_value)
#sn_value is the mean number of sunspots measured each month
freqs = np.fft.fftfreq(sn_value.size,d=fs)

power_spectrum = np.abs(fourier)

plt.plot(freqs,power_spectrum)

plt.xlim(0,max(freqs))
plt.title("Power Spectral Density of the Sunspot Number Time Series")
plt.grid(True)

Here is the graph output

I don't think this is correct - namely because I don't know what the scale of the x-axis is. However I do know that there should be a peak at (11years)^-1.

The second thing I'm wondering from this graph is why there seems to be two lines - one being a horizontal line just above y=0. It's more clear when I change the x-axis bounds to: plt.xlim(0,1).

Graph when plt.xlim(0,1)

Am I using the fourier transform functions incorrectly?

Upvotes: 7

Views: 11234

Answers (2)

mins
mins

Reputation: 7484

To set the step for frequencies to one year, determine how many samples represent this duration (12), then take the inverse for parameter d:

freqs = sf.rfftfreq(n=N, d=1/12)

Also, you can improve the frequency scale in some ways:

  • Use the period instead of the frequency
  • Reverse the scale direction to have small periods on the left.
  • Limit the range of frequencies plotted (e.g. first 50 components) in order to see details for components with longer periods. The two cycles are at 11 and 80-90 years.

enter image description here

import numpy as np
import pandas as pd
import scipy.fft as sf
import matplotlib.pyplot as plt
import matplotlib.ticker as tck

# Read values, select columns, convert to numpy array
df = pd.read_excel(fp)
df = df.take([3, 1, 4], axis=1)
data = df.to_numpy()

# Sort by date, extract columns in invidual views, remove DC offset
data = data[data[:,0].argsort()]
year = data[:,1]
spots = data[:,2]
spots = spots - spots.mean()

# Get positive DFT of AQI
N = year.shape[0]
X = sf.rfft(spots) / N
freqs = sf.rfftfreq(n=N, d=1/12) # unit = 1/12 of sampling period

# Plot signal
fig, axes = plt.subplots(figsize=(15,3), ncols=2)
ax=axes[0]
ax.plot(year, spots)
ax.xaxis.set_major_locator(tck.MultipleLocator(50))
#ax.grid()

# Plot DFT
ax=axes[1]
extent = 50#N
ax.set_xlabel('period, years')
ax.stem(freqs[:extent], abs(X[:extent]))
ticks = ax.get_xticks()
ax.set_xticklabels([f'{1/tick:.1f}' if tick!=0 else '$\infty$' for tick in ticks])
ax.invert_xaxis()
ax.grid()

Upvotes: 1

Cris Luengo
Cris Luengo

Reputation: 60444

You can use any units you want. Feel free to express your sampling frequency as fs=12 (samples/year), the x-axis will then be 1/year units. Or use fs=1 (sample/month), the units will then be 1/month.

The extra line you spotted comes from the way you plot your data. Look at the output of the np.fft.fftfreq call. The first half of that array contains positive values from 0 to 1.2e6 or so, the other half contain negative values from -1.2e6 to almost 0. By plotting all your data, you get a data line from 0 to the right, then a straight line from the rightmost point to the leftmost point, then the rest of the data line back to zero. Your xlim call makes it so you don’t see half the data plotted.

Typically you’d plot only the first half of your data, just crop the freqs and power_spectrum arrays.

Upvotes: 4

Related Questions