pjw

Reputation: 2325

Scikit-Learn or other Python tools for parameter optimization

I have a simple model with two parameters I need to "tune". Using the parameters 'a' and 'b', the model equation is:

model = (a * temp) + (b * rad)

temp and rad are measured datasets (in this case, temperature and radiation). These datasets are Pandas DateTime-indexed Series, at one-day (24-hr) frequency.

temp data looks like this:

TIMESTAMP
2014-07-17    1.399556
2014-07-18    1.492743
2014-07-19    1.865306
2014-07-20    2.478098
                ...   
2016-08-23    2.327437
2016-08-24    3.065250
2016-08-25    2.427021
2016-08-26    1.365833
Name: AirTC_2, Length: 213, dtype: float64

rad data looks like this:

TIMESTAMP
2014-07-17    2292.717541
2014-07-18    2228.255459
2014-07-19    2166.962811
2014-07-20    2803.802975
                 ...     
2016-08-23     696.327810
2016-08-24    1431.858289
2016-08-25    1083.182916
2016-08-26     542.908838
Name: CNR_Wm2, Length: 213, dtype: float64

I also have a measured dataset that the model is attempting to approximate. The measured dataset looks like this:

TIMESTAMP
2014-07-17    0.036750
2014-07-18    0.045892
2014-07-19    0.041919
2014-07-20    0.044640
            ...   
2016-08-23    0.029696
2016-08-24    0.033997
2016-08-25    0.032872
2016-08-26    0.012204
Name: melt_sonic, Length: 213, dtype: float64

I have done a preliminary optimization of the model parameters using standard regression techniques: minimizing the sum of the squared difference (error) between model and measured. I tested a range of parameter space for both a and b, running the model for 10,000 unique parameter combinations (where the array length for both a and b is 100).

a = np.arange(0.00000009,0.00001,0.0000001)   
b = np.arange(0.0115,0.0125,0.00001)
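For reference, a brute-force search of this kind can be sketched with NumPy broadcasting. This is only a minimal sketch, not the exact code used for the analysis: `grid_search` is a hypothetical helper, and the data, grids, and "true" parameter values below are synthetic stand-ins chosen so the result is easy to check.

```python
import numpy as np

def grid_search(temp, rad, measured, a_grid, b_grid):
    """Evaluate the sum of squared errors of a*temp + b*rad on an (a, b) grid.

    Returns the best a, the best b, and the full SSE surface.
    """
    temp = np.asarray(temp, dtype=float)
    rad = np.asarray(rad, dtype=float)
    measured = np.asarray(measured, dtype=float)
    # Broadcast to shape (len(a_grid), len(b_grid), n_samples)
    model = a_grid[:, None, None] * temp + b_grid[None, :, None] * rad
    sse = ((model - measured) ** 2).sum(axis=-1)
    # Locate the grid cell with the smallest error
    i, j = np.unravel_index(np.argmin(sse), sse.shape)
    return a_grid[i], b_grid[j], sse

# Synthetic stand-in data (assumption: NOT the measured series from the question)
rng = np.random.default_rng(0)
temp = rng.uniform(1.0, 3.0, size=213)
rad = rng.uniform(500.0, 3000.0, size=213)
measured = 0.01 * temp + 1e-5 * rad  # noiseless, so the true values sit on the grid

a_grid = np.linspace(0.0, 0.02, 21)   # hypothetical search range for a
b_grid = np.linspace(0.0, 2e-5, 21)   # hypothetical search range for b
best_a, best_b, sse = grid_search(temp, rad, measured, a_grid, b_grid)
print(best_a, best_b)
```

With real pandas Series, passing `temp.values`, `rad.values`, and `measured.values` (after aligning the indexes) should work the same way.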

I am simply coding the math to do this analysis, and I would like to double-check my results by independently optimizing the parameters using a package method from an appropriate library.

What is the most appropriate method to optimize these parameters using Scikit-Learn or another Python library?

Upvotes: 1

Views: 178

Answers (1)

TomDLT

Reputation: 4485

This is called "linear regression", and you don't need to try different combinations of the parameters to find good ones. The problem can be solved analytically with a direct mathematical formula, so you don't even need to guess a range for the parameters.
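For the two-parameter model in the question, the formula can be written out explicitly. The notation here is an assumption introduced for illustration: $X$ is the $n \times 2$ matrix whose columns are temp and rad, $y$ is the measured series, and $S_{uv} = \sum_i u_i v_i$. The solution exists as long as temp and rad are not exactly proportional to each other.

```latex
\begin{aligned}
(\hat a, \hat b)^{\top} &= \arg\min_{a,b} \sum_{i=1}^{n} \bigl(a\,\mathrm{temp}_i + b\,\mathrm{rad}_i - y_i\bigr)^2
  = (X^{\top} X)^{-1} X^{\top} y, \\[4pt]
\hat a &= \frac{S_{rr}\,S_{ty} - S_{tr}\,S_{ry}}{S_{tt}\,S_{rr} - S_{tr}^{2}}, \qquad
\hat b = \frac{S_{tt}\,S_{ry} - S_{tr}\,S_{ty}}{S_{tt}\,S_{rr} - S_{tr}^{2}},
\end{aligned}
```

where the subscripts $t$, $r$, and $y$ stand for temp, rad, and the measured series.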

In terms of code, you can use scikit-learn's LinearRegression estimator:

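import pandas as pd
from sklearn.linear_model import LinearRegression

# Column order must match the unpacking of coef_ below: temp -> a, rad -> b
X = pd.concat([temp, rad], axis=1)  # the inputs of the model
y = measured  # the output of the model

# fit_intercept=False because the model a*temp + b*rad has no constant term
estimator = LinearRegression(fit_intercept=False)  # create the estimator object
estimator.fit(X, y)  # optimize the parameters of the model on the data
a, b = estimator.coef_  # the obtained parameters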

For more information, see for example the scikit-learn tutorial on linear regression.
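The same least-squares problem can also be solved directly with NumPy, which gives a quick independent cross-check. This is a minimal sketch under stated assumptions: `fit_two_param` is a hypothetical helper, the model has no intercept, and the arrays below are synthetic stand-ins (with made-up true values 0.01 and 1e-5) rather than the data from the question.

```python
import numpy as np

def fit_two_param(temp, rad, measured):
    """Least-squares fit of measured ~ a*temp + b*rad (no intercept)."""
    # Design matrix with one column per parameter, in the order (a, b)
    X = np.column_stack([temp, rad])
    y = np.asarray(measured, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    a, b = coef
    return a, b

# Synthetic stand-in data (assumption: NOT the measured series from the question)
rng = np.random.default_rng(0)
temp = rng.uniform(1.0, 3.0, size=213)
rad = rng.uniform(500.0, 3000.0, size=213)
measured = 0.01 * temp + 1e-5 * rad

a_hat, b_hat = fit_two_param(temp, rad, measured)
print(a_hat, b_hat)
```

If the three pandas Series share the same index, they can be passed in as they are, since `np.column_stack` and `np.asarray` accept Series.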

Upvotes: 1
