James

Reputation: 113

LOWESS smoothing of time series data in Python

I am trying to use LOWESS to smooth the following data:

[Image: plot of the time series data]

I would like to obtain a smooth line that filters out the spikes in the data. My code is as follows:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import HourLocator, DayLocator, DateFormatter
from statsmodels.nonparametric.smoothers_lowess import lowess

file = r'C:...'
df = pd.read_csv(file) # reads data file   

df['Date'] = pd.to_datetime(df['Time Local'], format='%d/%m/%Y  %H:%M')     

x = df['Date']  
y1 = df['CTk2 Level'] 

filtered = lowess(y1, x, is_sorted=True, frac=0.025, it=0)

plt.plot(x, y1, 'r')
plt.plot(filtered[:,0], filtered[:,1], 'b')

plt.show()

When I run this code, I get the following error:

ValueError: view limit minimum -7.641460199922635e+16 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units

The dates in my data are in the format 07/05/2018 00:07:00. I suspect that LOWESS is struggling with the datetime data, but I'm not sure.

Can you please help me?

Upvotes: 10

Views: 8631

Answers (1)

chthonicdaemon

Reputation: 19810

LOWESS doesn't respect the DatetimeIndex type and instead returns the dates as floating-point nanoseconds since the epoch. Luckily, it is easy to convert back:

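# lowess returns an (n, 2) array; transpose it to unpack the two columns
smoothedx, smoothedy = lowess(y1, x, is_sorted=True, frac=0.025, it=0).T
# the x column holds nanoseconds since the epoch as floats
smoothedx = pd.to_datetime(smoothedx.astype('int64'))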
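For reference, here is a minimal self-contained sketch of the same round trip. It assumes synthetic data in place of the original CSV (the spikes, `frac=0.1` and `it=3` are illustrative choices, not values from the question), and it converts the timestamps to integer nanoseconds explicitly before calling `lowess` rather than relying on the implicit conversion:

```python
import numpy as np
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess

# Synthetic stand-in for the question's data (an assumption, not the real file):
# one reading per minute for six hours, with three artificial spikes.
rng = np.random.default_rng(0)
dates = pd.date_range("2018-05-07 00:07", periods=360, freq="min")
level = 50 + 5 * np.sin(np.linspace(0, 3 * np.pi, 360)) + rng.normal(0, 0.2, 360)
level[[40, 150, 290]] += 15

df = pd.DataFrame({"Date": dates, "CTk2 Level": level})

# Convert the timestamps to integer nanoseconds since the epoch up front,
# so that lowess only ever sees plain numbers.
x_ns = df["Date"].astype("datetime64[ns]").astype("int64").to_numpy()
y = df["CTk2 Level"].to_numpy()

# frac and it are illustrative; it > 0 adds residual-based reweighting passes.
smoothed = lowess(y, x_ns, is_sorted=True, frac=0.1, it=3)

# Column 0 is x (float nanoseconds), column 1 is the smoothed y.
smoothed_x = pd.to_datetime(smoothed[:, 0].astype("int64"))
smoothed_y = smoothed[:, 1]
```

`smoothed_x` is a `DatetimeIndex` again, so `plt.plot(smoothed_x, smoothed_y, 'b')` should share a date axis with `plt.plot(df['Date'], df['CTk2 Level'], 'r')` without triggering the view-limit error.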

Upvotes: 6
