I came across the basin hopping algorithm in scipy and created a simple problem to understand how to use it but it doesnt seem to be working correctly for that problem. May be I'm doing something completely wrong.
Here is the code:
import scipy.optimize as spo
import numpy as np
minimizer_kwargs = {"method":"BFGS"}
f1=lambda x: (x-4)
def mybounds(**kwargs):
x = kwargs["x_new"]
tmax = bool(np.all(x <= 1.0))
tmin = bool(np.all(x >= 0.0))
print x
print tmin and tmax
return tmax and tmin
def print_fun(x, f, accepted):
print("at minima %.4f accepted %d" % (f, int(accepted)))
x0=[1.]
spo.basinhopping(f1,x0,accept_test=mybounds,callback=print_fun,niter=200,minimizer_kwargs=minimizer_kwargs)
The solution it gives is x: array([ -1.80746874e+08])
The function you are testing makes use of an approach called Metropolis-Hastings, which can be modified into a procedure called simulated annealing that can optimze functions in a stochastic way.
The way this works is as follows. First you pick a point, like your point x0
. From that point, you generate a random perturbation (this is called a "proposal"). Once there is a proposed perturbation, you get your candidate for a new point by applying the perturbation to your current out. So, you could think of it like x1 = x0 + perturbation
.
In regular old gradient descent, the perturbation
term is just a deterministically calculated quantity, like a step in the direction of the gradient. But in Metropolis-Hastings, perturbation
is generated randomly (sometimes by using the gradient as a clue about where to randomly go... but sometimes just randomly with no clues).
At this point when you've got x1
, you have to ask yourself: "Did I do a good thing by randomly perturbing x0
or did I just mess everything up?" One part of that has to do with sticking inside of some bounds, such as your mybounds
function. Another part of it has to do with how much better/worse the objective function's value has become at the new point.
So there are two ways to reject the proposal of x1
: first, it could violate the bounds you have set and be an infeasible point by the problem's definition; second, from the accept/reject assessment step of Metropolis-Hastings, it could just be a very bad point that should be rejected. In either case, you will reject x1
and instead set x1 = x0
and pretend as if you just stayed in the same spot to try again.
Contrast that with a gradient-type method, where you'll definitely, no matter what, always make at least some kind of movement (a step in the direction of the gradient).
Whew, all right. With that all aside, let's think about how this plays out with the basinhopping
function. From the documentation we can see that the typical acceptance condition is accessed by the take_step
argument, and the documentation saying this: "The default step taking routine is a random displacement of the coordinates, but other step taking algorithms may be better for some systems." So, even apart from your mybounds
bounds checker, the function will be making a random displacement of the coordinates to generate its new point to try. And since the gradient of this function is just the constant 1
, it will always be making the same big steps in the direction of negative gradient (for minimizing).
At a practical level, this means that the proposed points for x1
are always going to be quite outside of the interval [0,1]
and your bounds checker will always veto them.
When I run your code, I see this happening all the time:
In [5]: spo.basinhopping(f1,x0,accept_test=mybounds,callback=print_fun,niter=200,minimizer_kwargs=minimizer_kwargs)
at minima -180750994.1924 accepted 0
[ -1.80746874e+08]
False
at minima -180746877.5530 accepted 0
[ -1.80746873e+08]
False
at minima -180746877.3896 accepted 0
[ -1.80750991e+08]
False
at minima -180750994.7281 accepted 0
[ -1.80746874e+08]
False
at minima -180746878.2433 accepted 0
[ -1.80746874e+08]
False
at minima -180746877.5774 accepted 0
[ -1.80746874e+08]
False
at minima -180746878.3173 accepted 0
[ -1.80750990e+08]
False
at minima -180750994.3509 accepted 0
[ -1.80750991e+08]
False
at minima -180750994.6605 accepted 0
[ -1.80746874e+08]
False
at minima -180746877.6966 accepted 0
[ -1.80746874e+08]
False
at minima -180746877.6900 accepted 0
[ -1.80750990e+08]
False
at minima -180750993.9707 accepted 0
[ -1.80750990e+08]
False
at minima -180750994.0494 accepted 0
[ -1.80750991e+08]
False
at minima -180750994.5824 accepted 0
[ -1.80746874e+08]
False
at minima -180746877.5459 accepted 0
[ -1.80750991e+08]
False
at minima -180750994.6679 accepted 0
[ -1.80750991e+08]
False
at minima -180750994.5823 accepted 0
[ -1.80750990e+08]
False
at minima -180750993.9308 accepted 0
[ -1.80746874e+08]
False
at minima -180746878.0395 accepted 0
[ -1.80750991e+08]
False
# ... etc.
So it's never accepting the posposal points. The output is not telling you that it has found a solution. It's telling you than the random perturbing to explore possible solutions keeps resulting in points that look better and better to the optimizer, but which keep failing to satisfy your criteria. It can't find its way all the way back to [0,1]
to get points that do satisfy mybounds
.