Reputation: 1
I have two separate databases: a temperature db with hourly data and a house db with minute-by-minute data for HVAC usage. I'm trying to plot the HVAC data as a temperature series over a week, a month, and a year, but since the increments don't match the temperature db, I'm having trouble. I've tried making a least-squares fit, but a) I can't figure out how to do one in pandas and b) it gets really inaccurate after a day or two. Any suggestions?
Upvotes: 0
Views: 453
Reputation: 54380
pandas timeseries is perfect for this application. You can merge series of different sample frequencies and pandas will align them on their timestamps. Then you can downsample the data and perform a regression, e.g., with statsmodels. A mock-up example:
In [288]:
import numpy as np
import pandas as pd
# Daily series (10 points) and hourly series (500 points) starting on the same date
idx1 = pd.date_range('2001/01/01', periods=10, freq='D')
idx2 = pd.date_range('2001/01/01', periods=500, freq='h')
df1 = pd.DataFrame(np.random.random(10), columns=['val1'], index=idx1)
df2 = pd.DataFrame(np.random.random(500), columns=['val2'], index=idx2)
In [291]:
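# Inner join on the index keeps only the timestamps present in both frames
df3 = pd.merge(df1, df2, left_index=True, right_index=True, how='inner')
# Recent pandas versions need an explicit aggregation after resample()
df4 = df3.resample(rule='D').mean()
In [292]:
print(df4)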
                val1      val2
2001-01-01  0.399901  0.244800
2001-01-02  0.014448  0.423780
2001-01-03  0.811747  0.070047
2001-01-04  0.595556  0.679096
2001-01-05  0.218412  0.116764
2001-01-06  0.961310  0.040317
2001-01-07  0.058964  0.606843
2001-01-08  0.075129  0.407842
2001-01-09  0.833003  0.751287
2001-01-10  0.070072  0.559986
[10 rows x 2 columns]
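One thing to watch in the mock-up above: because the merge uses how='inner', only the hourly rows stamped at midnight survive, so val2 is a single hourly reading per day rather than a daily average. If a daily average is what you want, a minimal sketch (assuming seeded random data and the hypothetical names daily, hourly, and aligned, none of which come from the original answer) is to downsample the hourly frame first and join afterwards:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the sketch is reproducible

# Same shapes as the mock-up: 10 daily values and 500 hourly values.
daily = pd.DataFrame(
    {"val1": rng.random(10)},
    index=pd.date_range("2001-01-01", periods=10, freq="D"),
)
hourly = pd.DataFrame(
    {"val2": rng.random(500)},
    index=pd.date_range("2001-01-01", periods=500, freq="h"),
)

# Average the hourly readings within each calendar day first ...
hourly_daily_mean = hourly.resample("D").mean()

# ... then keep only the days that appear in both frames.
aligned = daily.join(hourly_daily_mean, how="inner")
print(aligned)
```

Here each val2 entry is the mean of that day's hourly readings, and the inner join drops the extra days that only the hourly series covers.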
In [294]:
import statsmodels.formula.api as smf
# Ordinary least squares: val1 explained by val2
mod = smf.ols(formula='val1 ~ val2', data=df4)
res = mod.fit()
print(res.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                   val1   R-squared:                       0.061
Model:                            OLS   Adj. R-squared:                 -0.056
Method:                 Least Squares   F-statistic:                    0.5231
Date:                Fri, 27 Jun 2014   Prob (F-statistic):              0.490
Time:                        10:46:34   Log-Likelihood:                -3.3643
No. Observations:                  10   AIC:                             10.73
Df Residuals:                       8   BIC:                             11.33
Df Model:                           1
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      0.5405      0.224      2.417      0.042         0.025     1.056
val2          -0.3502      0.484     -0.723      0.490        -1.467     0.766
==============================================================================
Omnibus:                        3.509   Durbin-Watson:                   2.927
Prob(Omnibus):                  0.173   Jarque-Bera (JB):                1.232
Skew:                           0.399   Prob(JB):                        0.540
Kurtosis:                       1.477   Cond. No.                         4.69
==============================================================================
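For the setup in the question (hourly temperature, minute-by-minute HVAC), the same idea might look like the sketch below. Everything in it is an assumption made for illustration: the data is synthetic, the names temperature, hvac_on, hvac_hourly, and combined are hypothetical, and the fit uses numpy's polyfit as a lightweight stand-in for the statsmodels regression shown above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical stand-ins for the two databases: one week of hourly
# temperature and one week of minute-by-minute HVAC on/off flags.
hours = pd.date_range("2014-06-01", periods=7 * 24, freq="h")
minutes = pd.date_range("2014-06-01", periods=7 * 24 * 60, freq="min")

# Synthetic temperature with a daily cycle between 10 and 30 degrees.
temperature = pd.Series(
    20 + 10 * np.sin(2 * np.pi * np.arange(len(hours)) / 24),
    index=hours,
    name="temp_c",
)

# Synthetic HVAC flag: more likely to be on when it is warmer.
p_on = (temperature.reindex(minutes, method="ffill").to_numpy() - 10) / 20
hvac_on = pd.Series(
    (rng.random(len(minutes)) < p_on).astype(int),
    index=minutes,
    name="hvac_on",
)

# Downsample the minute data to the temperature's hourly grid:
# the mean of a 0/1 flag is the fraction of the hour the HVAC ran.
hvac_hourly = hvac_on.resample("h").mean().rename("hvac_fraction")

# Both series now share the same hourly index, so they line up directly.
combined = pd.concat([temperature, hvac_hourly], axis=1).dropna()

# Straight-line least-squares fit of HVAC usage against temperature.
slope, intercept = np.polyfit(combined["temp_c"], combined["hvac_fraction"], deg=1)

# Coarser views for longer plots, e.g. daily means.
daily = combined.resample("D").mean()
print(daily)
```

The aligned frame can then be resampled to whatever granularity the plot needs, for example combined.resample("W").mean() for a weekly view, and plotted with combined.plot().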
Upvotes: 3