Reputation: 849
I have a DataFrame with timestamps as the index and price values as a column. When I plot it using plot_acf, the x-axis starts from 1970.
Code:
import pandas as pd
from pandas import Timestamp
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
data = {'price_btc': {Timestamp('2017-04-04 00:00:00'): 1132.0,
Timestamp('2017-04-05 00:00:00'): 1142.0,
Timestamp('2017-04-06 00:00:00'): 1128.0,
Timestamp('2017-04-07 00:00:00'): 1164.0,
Timestamp('2017-04-08 00:00:00'): 1189.0,
Timestamp('2017-04-09 00:00:00'): 1188.0,
Timestamp('2017-04-10 00:00:00'): 1194.0,
Timestamp('2017-04-11 00:00:00'): 1208.0,
Timestamp('2017-04-12 00:00:00'): 1213.0,
Timestamp('2017-04-13 00:00:00'): 1218.0}}
df = pd.DataFrame(data)
# Original Series
fig, axes = plt.subplots(3, 2, sharex=True, figsize=(20, 5))
axes[0, 0].plot(df.price_btc); axes[0, 0].set_title('Original Series')
plot_acf(df.price_btc, ax=axes[0, 1])
# 1st Differencing
axes[1, 0].plot(df.price_btc.diff()); axes[1, 0].set_title('1st Order Differencing')
plot_acf(df.price_btc.diff().dropna(), ax=axes[1, 1])
# 2nd Differencing
axes[2, 0].plot(df.price_btc.diff().diff()); axes[2, 0].set_title('2nd Order Differencing')
plot_acf(df.price_btc.diff().diff().dropna(), ax=axes[2, 1])
plt.show()
The expected output is to have dates from 2017 on the x-axis, not 1970. The issue is with sharex=True: the autocorrelation plots need different x labels.
Upvotes: 2
Views: 1705
Reputation: 923
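First of all, a note on second-order differencing: .diff().diff() is the correct way to get it. The integer argument of the .diff() method is the number of periods (the lag), not the order, so .diff(2) computes x[t] - x[t-2], which is a lag-2 difference rather than a second-order one.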
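To make the meaning of the integer passed to .diff() concrete, here is a minimal sketch on a toy series. The series of perfect squares is a hypothetical example chosen only because the results are easy to check by hand; it is not part of the question's data.

```python
import pandas as pd

# Hypothetical toy series (perfect squares), not the question's data.
s = pd.Series([1, 4, 9, 16, 25])

# The argument of .diff() is `periods`, i.e. the lag: x[t] - x[t-2].
lag2 = s.diff(2)

# Second-order differencing: difference the first differences again.
second_order = s.diff().diff()

print(lag2.tolist())          # [nan, nan, 8.0, 12.0, 16.0]
print(second_order.tolist())  # [nan, nan, 2.0, 2.0, 2.0]
```

The two results differ, so the two calls are not interchangeable when preparing input for plot_acf.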
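If you set sharex=True, all x-axes are the same. The autocorrelation plot is not scaled by dates but by lags, so on a shared date axis the lags 0, 1, 2, ... are read as days since the 1970 epoch, which is why the axis starts at 1970. Set sharex=False and add this line for each time series plot to set the x-axis: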
axes[0, 0].set_xlim(dt.datetime(2017,4,4), dt.datetime(2017,4,13))
You can set the start and end date then. Here is the full code:
import pandas as pd
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import matplotlib.pyplot as plt
import datetime as dt
import matplotlib.dates as mdates
data = {'price_btc': {pd.Timestamp('2017-04-04 00:00:00'): 1132.0,
pd.Timestamp('2017-04-05 00:00:00'): 1142.0,
pd.Timestamp('2017-04-06 00:00:00'): 1128.0,
pd.Timestamp('2017-04-07 00:00:00'): 1164.0,
pd.Timestamp('2017-04-08 00:00:00'): 1189.0,
pd.Timestamp('2017-04-09 00:00:00'): 1188.0,
pd.Timestamp('2017-04-10 00:00:00'): 1194.0,
pd.Timestamp('2017-04-11 00:00:00'): 1208.0,
pd.Timestamp('2017-04-12 00:00:00'): 1213.0,
pd.Timestamp('2017-04-13 00:00:00'): 1218.0}}
df = pd.DataFrame(data)
# Original Series
fig, axes = plt.subplots(3, 2, sharex=False, figsize=(20, 5))
axes[0, 0].plot(df.price_btc); axes[0, 0].set_title('Original Series')
axes[0, 0].set_xlim(dt.datetime(2017,4,4), dt.datetime(2017,4,13))
plot_acf(df.price_btc, ax=axes[0, 1])
# 1st Differencing
axes[1, 0].plot(df.price_btc.diff()); axes[1, 0].set_title('1st Order Differencing')
axes[1, 0].set_xlim(dt.datetime(2017,4,4), dt.datetime(2017,4,13))
plot_acf(df.price_btc.diff(1).dropna(), ax=axes[1, 1])
# 2nd Differencing
axes[2, 0].plot(df.price_btc.diff().diff()); axes[2, 0].set_title('2nd Order Differencing')
axes[2, 0].set_xlim(dt.datetime(2017,4,4), dt.datetime(2017,4,13))
# .diff(2) would be a lag-2 difference, so chain .diff() twice to match the left-hand plot
plot_acf(df.price_btc.diff().diff().dropna(), ax=axes[2, 1])
plt.show()
Upvotes: 1