Reputation: 2325
I have a simple model with two parameters I need to "tune". Using the parameters 'a' and 'b', the model equation is:
model = (a * temp) + (b * rad)
temp and rad are measured datasets (in this case, temperature and radiation). These datasets are Pandas DateTime-indexed Series, at one-day (24-hr) frequency.
The temp data looks like this:
TIMESTAMP
2014-07-17 1.399556
2014-07-18 1.492743
2014-07-19 1.865306
2014-07-20 2.478098
...
2016-08-23 2.327437
2016-08-24 3.065250
2016-08-25 2.427021
2016-08-26 1.365833
Name: AirTC_2, Length: 213, dtype: float64
The rad data looks like this:
TIMESTAMP
2014-07-17 2292.717541
2014-07-18 2228.255459
2014-07-19 2166.962811
2014-07-20 2803.802975
...
2016-08-23 696.327810
2016-08-24 1431.858289
2016-08-25 1083.182916
2016-08-26 542.908838
Name: CNR_Wm2, Length: 213, dtype: float64
I also have a measured dataset that the model is attempting to approximate. The measured dataset looks like this:
TIMESTAMP
2014-07-17 0.036750
2014-07-18 0.045892
2014-07-19 0.041919
2014-07-20 0.044640
...
2016-08-23 0.029696
2016-08-24 0.033997
2016-08-25 0.032872
2016-08-26 0.012204
Name: melt_sonic, Length: 213, dtype: float64
I have done a preliminary optimization of the model parameters using standard regression techniques: minimizing the sum of squared differences (errors) between model and measured. I tested a range of parameter space for both a and b, running the model for 10,000 unique parameter combinations (where the array length for both a and b is 100).
a = np.arange(0.00000009,0.00001,0.0000001)
b = np.arange(0.0115,0.0125,0.00001)
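For reference, the brute-force search described above can be sketched as follows. This is a minimal sketch, not the original analysis code: the temp, rad and measured arrays are synthetic stand-ins (the real series are not reproduced here), the "true" parameters are hypothetical values picked from the grids, and only the a and b ranges are taken from the question.

```python
import numpy as np

# Parameter grids, as given in the question.
a = np.arange(0.00000009, 0.00001, 0.0000001)
b = np.arange(0.0115, 0.0125, 0.00001)

# Synthetic stand-ins for the real series (hypothetical values, 213 days).
rng = np.random.default_rng(0)
temp = rng.uniform(1.0, 3.0, size=213)
rad = rng.uniform(500.0, 2800.0, size=213)

# Hypothetical "true" parameters, chosen on the grid so the search can recover them.
true_a, true_b = a[50], b[50]
measured = true_a * temp + true_b * rad

# Evaluate the model for every (a, b) pair at once via broadcasting:
# the result has shape (len(a), len(b), 213).
model = a[:, None, None] * temp + b[None, :, None] * rad

# Sum of squared errors for each parameter combination.
sse = ((model - measured) ** 2).sum(axis=-1)

# Indices of the combination with the smallest error.
i, j = np.unravel_index(np.argmin(sse), sse.shape)
best_a, best_b = a[i], b[j]
print(best_a, best_b)
```

A grid search like this can only return values that lie on the grid, so its resolution is limited by the step size of a and b.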
I am simply coding the math to do this analysis, and I would like to double-check my results by independently optimizing the parameters using a package method from an appropriate library.
What is the most appropriate method to optimize these parameters using Scikit-Learn or another Python library?
Upvotes: 1
Views: 178
Reputation: 4485
This is called "linear regression", and you don't need to try different combinations of the parameters to find the best ones. The problem can be solved analytically with a direct mathematical formula, so you don't even need to guess the range of good parameter values.
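For reference, the formula in question is the ordinary least-squares solution of the normal equations. A sketch, assuming the model has no intercept term (as in the question's equation) and that X^T X is invertible; the symbols X, y and θ are notation introduced here, with y standing for the measured series:

```latex
\min_{a,b} \sum_{i=1}^{n} \bigl( y_i - a\,\mathrm{temp}_i - b\,\mathrm{rad}_i \bigr)^2
\quad\Longrightarrow\quad
\hat{\theta} = \begin{pmatrix} \hat{a} \\ \hat{b} \end{pmatrix} = (X^\top X)^{-1} X^\top y,
\qquad
X = \begin{pmatrix} \mathrm{temp}_1 & \mathrm{rad}_1 \\ \vdots & \vdots \\ \mathrm{temp}_n & \mathrm{rad}_n \end{pmatrix}
```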
In terms of code, you can use scikit-learn's LinearRegression estimator:
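import pandas as pd
from sklearn.linear_model import LinearRegression
# Column order must match the model equation: a multiplies temp, b multiplies rad
X = pd.concat([temp, rad], axis=1) # the input of the model
y = measured # the output of the model
# The model equation has no constant term, so the intercept is disabled
estimator = LinearRegression(fit_intercept=False) # create the estimator object
estimator.fit(X, y) # optimize the parameters of the model on the data
a, b = estimator.coef_ # the obtained parameters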
For more information, see this example for a tutorial on linear regression.
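As an independent cross-check on the scikit-learn fit, the same least-squares problem can be solved with NumPy alone. This is a minimal sketch under stated assumptions: the arrays below are synthetic stand-ins for temp, rad and measured, the "true" parameters and noise level are hypothetical, and the model is assumed to have no intercept term.

```python
import numpy as np

# Synthetic stand-ins for the real series (hypothetical values, 213 days).
rng = np.random.default_rng(42)
temp = rng.uniform(1.0, 3.0, size=213)
rad = rng.uniform(500.0, 2800.0, size=213)

# Hypothetical "true" parameters plus a little measurement noise.
true_a, true_b = 0.01, 1e-5
measured = true_a * temp + true_b * rad + rng.normal(0.0, 1e-4, size=213)

# Design matrix with one column per regressor, in the order (temp, rad),
# so the solution comes back in the order (a, b). No column of ones is
# added because the model has no intercept.
X = np.column_stack([temp, rad])

# Least-squares solution of X @ [a, b] ~= measured.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, measured, rcond=None)
a_hat, b_hat = coef
print(a_hat, b_hat)
```

With the real data, temp.to_numpy() and rad.to_numpy() would take the place of the synthetic arrays, provided the three series share the same index and contain no missing values.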
Upvotes: 1