I have a classic linear regression problem of the form:
    y = X b
where y is a response vector, X is a matrix of input variables, and b is the vector of fit parameters I am searching for.
Python provides b = numpy.linalg.lstsq(X, y)[0] for solving problems of this form (lstsq returns a tuple whose first element is the solution). However, when I use this I tend to get either extremely large or extremely small values for the components of b.
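A minimal sketch of that call with toy data (the dimensions here are illustrative, not the real ones):

    import numpy as np

    # Toy stand-in for the real problem (X is roughly 3375 x 1500 in practice).
    rng = np.random.default_rng(0)
    X = rng.random((6, 3))
    y = rng.random(6)

    # lstsq returns (solution, residuals, rank, singular_values);
    # the fitted coefficients are the first element.
    b, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)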
I'd like to perform the same fit, but constrain the values of b between 0 and 255.
It looks like scipy.optimize.fmin_slsqp() is an option, but I found it extremely slow for the size of problem I'm interested in (X is something like 3375 by 1500, and hopefully even larger).
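For reference, here is roughly the kind of bounded fmin_slsqp setup I mean, on toy-sized data (the objective function and starting point are just one way to phrase it):

    import numpy as np
    from scipy.optimize import fmin_slsqp

    rng = np.random.default_rng(0)
    X = rng.random((50, 10))   # toy stand-in for the ~3375 x 1500 problem
    y = rng.random(50) * 255

    def objective(b):
        # Sum of squared residuals for y = X b.
        r = X @ b - y
        return r @ r

    b0 = np.full(X.shape[1], 128.0)        # start in the middle of the box
    bounds = [(0.0, 255.0)] * X.shape[1]   # constrain each coefficient to [0, 255]
    b = fmin_slsqp(objective, b0, bounds=bounds)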
Are there any Python routines for performing Lasso Regression or Ridge Regression, or some other regression method that penalizes large b coefficient values?

You mention you would find Lasso Regression or Ridge Regression acceptable. These and many other constrained linear models are available in the scikit-learn package. Check out the section on generalized linear models.
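For example, with stand-in data (the alpha values below are placeholders you would tune for your problem):

    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    # Stand-in data for illustration.
    rng = np.random.default_rng(0)
    X = rng.random((200, 50))
    y = rng.random(200)

    # Ridge penalizes the squared size of the coefficients;
    # Lasso penalizes their absolute size and drives some exactly to zero.
    ridge = Ridge(alpha=1.0).fit(X, y)
    lasso = Lasso(alpha=0.1).fit(X, y)
    print(ridge.coef_)
    print(lasso.coef_)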
Usually constraining the coefficients involves some kind of regularization parameter (C or alpha); some of the models (the ones ending in CV) can use cross-validation to set these parameters automatically. You can also further constrain models to use only positive coefficients: for example, there is a positive option on the Lasso model.
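A sketch combining both points, assuming LassoCV with cv=5 (an arbitrary choice) and positive=True:

    import numpy as np
    from sklearn.linear_model import LassoCV

    rng = np.random.default_rng(0)
    X = rng.random((200, 50))
    y = rng.random(200)

    # LassoCV chooses alpha by cross-validation; positive=True restricts
    # the model to non-negative coefficients.
    model = LassoCV(cv=5, positive=True).fit(X, y)
    print(model.alpha_)
    print(model.coef_)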