Reputation: 23
I was trying to run a Least Absolute Deviations (LAD) regression, i.e. L1 regression. I did it by designing a linear program and solving it with CVXOPT (in Python). The number of features (including the constant) was 13 and the number of samples was ~100K. It took many hours to run to completion.
Then I realized that I can run statsmodels' QuantReg with q=0.5. It took a few seconds.
I read that they use Iteratively reweighted least squares (IRLS).
Is IRLS inherently faster than linear programming for this purpose, or would a different formulation of the linear program, solved with a better solver such as Gurobi, reduce the running time substantially?
The reason I am asking is that I want to try a loss function that involves a sum of two different absolute-value terms with different weights. I believe I can employ IRLS for that problem as well, but I am not sure, and I am more comfortable with linear programming.
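To be concrete, an illustrative form of the kind of loss I mean (the symbols $a_i, b_i, z_i$ and weights $w_1, w_2 \geq 0$ are placeholders) is:

$$\min_{\beta} \sum_i \left( w_1 \lvert a_i^\top \beta - y_i \rvert + w_2 \lvert b_i^\top \beta - z_i \rvert \right)$$

which linearizes as an LP with two slack vectors $u, v$:

$$\min \sum_i \left( w_1 u_i + w_2 v_i \right) \quad \text{s.t.} \quad -u_i \le a_i^\top \beta - y_i \le u_i, \qquad -v_i \le b_i^\top \beta - z_i \le v_i.$$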
The formulation I used for the linear program, with slack vector $v$ and coefficient vector $A$, was:

$$\min_{v, A} \sum_i v_i \quad \text{s.t.} \quad -v_i \le x_i^\top A - y_i \le v_i$$
The resulting code for CVXOPT was as follows:
import numpy as np
from cvxopt import matrix, spmatrix, solvers
from scipy.sparse import identity, hstack, vstack

def scipy_sparse_to_spmatrix(a):
    '''Convert a SciPy sparse matrix to a CVXOPT spmatrix.'''
    coo = a.tocoo()
    return spmatrix(coo.data.tolist(), coo.row.tolist(), coo.col.tolist(), size=a.shape)

def l1_fit(x, y):
    '''
    :param x: feature matrix (n samples x k features)
    :param y: label vector
    :return: LAD coefficients
    The problem I construct is:
    min c'*(v|A)
    subject to:
    (-I|x)*(v|A)<=y
    (-I|-x)*(v|A)<=-y
    '''
    n, k = y.shape[0], x.shape[1]
    # Objective: minimize the sum of the slacks v; the coefficients A are free
    c = matrix(np.concatenate((np.ones(n), np.zeros(k))).astype(float).tolist(), (n + k, 1), 'd')
    # G = [[-I, x], [-I, -x]], so G*(v|A) <= h encodes |x*A - y| <= v elementwise
    G = scipy_sparse_to_spmatrix(vstack((hstack((-identity(n), x)),
                                         hstack((-identity(n), -x)))).tocoo())
    h = matrix(np.concatenate((y, -y)).astype(float).tolist(), (2 * n, 1), 'd')
    sol = solvers.lp(c, G, h)
    return np.array(sol['x'][n:]).ravel()
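For reference, the same LP can also be written with SciPy's linprog, whose HiGHS backend is typically much faster than CVXOPT's default solver (this is a sketch of the idea, not code I have timed on the full data set):

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse import identity, hstack, vstack

def l1_fit_linprog(x, y):
    """LAD via the same LP, solved with SciPy's HiGHS backend.

    Decision vector is (v | beta): minimize sum(v)
    subject to -v <= y - x @ beta <= v, with v >= 0 and beta free.
    """
    n, k = x.shape
    c = np.concatenate([np.ones(n), np.zeros(k)])
    I = identity(n, format="csc")
    # [-I |  x] (v|beta) <= y   and   [-I | -x] (v|beta) <= -y
    A_ub = vstack([hstack([-I, x]), hstack([-I, -x])], format="csc")
    b_ub = np.concatenate([y, -y])
    bounds = [(0, None)] * n + [(None, None)] * k
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[n:]
```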
Upvotes: 2
Views: 521
Reputation: 4963
If you have access to Gurobi or CPLEX, then you should follow the analysis in Erwin Kalvelagen's Linear Programming and LAD Regression.
It shows which formulation of the problem as a linear program performs best.
Yet, I'm not sure converting the model into an LP problem is the way to go. I generated a model matrix $\boldsymbol{A} \in \mathbb{R}^{100,000 \times 15}$.
I solved the problem using Convex.jl (the Julia equivalent of CVXPY) with the ECOS solver (which should be available in CVXPY as well) in its direct form.
The problem was solved in under 50 [Sec], so about 1-2 orders of magnitude faster.
Moreover, I implemented a solver based on Iteratively Reweighted Least Squares (IRLS). I could solve the problem in less than a second.
The Convex.jl (ECOS) solution run time: 44.50541 [Sec]
The IRLS solution run time: 0.22733 [Sec]
The reason for the speed is that IRLS's most expensive operation, solving a linear system, works on A' * A, which in your case is small (15 x 15).
I implemented it in Julia with a relatively efficient IRLS routine I wrote (no allocations).
You could probably imitate it pretty closely in Python using NumPy, SciPy, and Numba.
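As a starting point, a minimal NumPy sketch of IRLS for LAD (my own illustration here, not the repository code; `eps` guards against division by near-zero residuals) could look like:

```python
import numpy as np

def lad_irls(A, y, n_iter=50, eps=1e-8):
    """LAD regression via Iteratively Reweighted Least Squares.

    Each iteration solves a weighted least-squares problem whose
    normal equations involve only the small k x k matrix A' W A.
    """
    # Start from the ordinary least-squares solution
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - A @ beta
        # L1 loss corresponds to weights 1/|r_i|, clipped away from zero
        w = 1.0 / np.maximum(np.abs(r), eps)
        AW = A * w[:, None]
        # Solve (A' W A) beta = A' W y -- only k x k, hence the speed
        beta_new = np.linalg.solve(AW.T @ A, AW.T @ y)
        if np.linalg.norm(beta_new - beta) < 1e-10 * (1 + np.linalg.norm(beta)):
            beta = beta_new
            break
        beta = beta_new
    return beta
```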
So solving your problem is pretty much an hour of coding away.
The code is available in my StackOverflow GitHub repository (look at the StackOverflow\Q64422417 folder).
Upvotes: 0