Using fminunc function

rajan sthapit · May 14, 2012

I am trying to use the fminunc function for convex optimization. However, in my case I am taking the gradient with respect to logx rather than x. Let my objective function be F. Then, by the chain rule, the gradient is

dF/dx = (dF/dlogx) * (1/x)
=> dF/dlogx = (dF/dx) * x

So

logx_new = logx_old + learning_rate * x * (dF/dx)
x_new = exp(logx_new)

How can I implement this in fminunc?

Answer

Gunther Struyf · May 14, 2012

It's possible and described in the documentation:

If the gradient of fun can also be computed and the GradObj option is 'on', as set by options = optimset('GradObj','on') then the function fun must return, in the second output argument, the gradient value g, a vector, at x.

fminunc with custom gradient

So for example: if f = @(x) x.^2; then df/dx = 2*x and you can use

function [f df] = f_and_df(x)
    f = x.^2;
    if nargout>1
        df = 2*x;
    end
end

You can then pass that function to fminunc:

options = optimset('GradObj','on');
x0 = 5;
[x,fval] = fminunc(@f_and_df,x0,options);
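Before handing a hand-written gradient to fminunc, it can be worth checking it against a finite difference. The snippet below is only a minimal sketch for the x.^2 example; the step size h and the comparison point are arbitrary choices of mine, not something from the documentation.

```matlab
% Sanity check (illustrative, not part of the original answer):
% compare the analytic gradient with a central finite difference.
f  = @(x) x.^2;     % objective from the example above
df = @(x) 2*x;      % analytic gradient returned as the second output
x0 = 5;             % point at which to compare
h  = 1e-6;          % finite-difference step (an arbitrary small value)
fd = (f(x0 + h) - f(x0 - h)) / (2*h);   % central difference estimate
fprintf('analytic: %g, finite difference: %g\n', df(x0), fd);
```

If the two numbers disagree noticeably, the gradient code is the first place to look.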

fminunc with logx gradient

For your logx gradient, this becomes:

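function [f df] = f_and_df(x)
    f = ...;                 % value of your objective F at x
    if nargout>1
        dF_dx = ...;         % gradient of F with respect to x
        df = x .* dF_dx;     % dF/dlogx = (dF/dx) * x
    end
end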

and the fminunc stays the same.
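One caveat: as far as I know, fminunc treats the returned gradient as the derivative with respect to the variable it is actually searching over. A way to keep the two consistent is to let fminunc search over u = log(x) and convert back with exp afterwards. The following is a sketch under that assumption; the function name f_of_logx and the objective (x - 2).^2 are hypothetical stand-ins for your own F.

```matlab
% Sketch: let fminunc search over u = log(x) instead of x.
% f_of_logx and the objective inside it are hypothetical stand-ins.
options = optimset('GradObj','on');
u0 = log(5);                              % corresponds to x0 = 5
[u, fval] = fminunc(@f_of_logx, u0, options);
x = exp(u);                               % map the solution back to x

% Local functions in scripts need R2016b or later; on older releases,
% save this function in its own file f_of_logx.m.
function [f, df_du] = f_of_logx(u)
    x = exp(u);                           % x is positive by construction
    f = (x - 2).^2;                       % illustrative F(x), smallest at x = 2
    if nargout > 1
        df_dx = 2*(x - 2);                % gradient with respect to x
        df_du = df_dx .* x;               % chain rule: dF/dlogx = (dF/dx) * x
    end
end
```

Because x = exp(u) is always positive, this form also avoids taking the log of a negative iterate.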

fminunc with anonymous function

If you want, you can also use anonymous functions:

f_and_df2 = @(x) deal(x(1).^2+x(2).^2,[2*x(1)  2*x(2)]);
[x,fval] = fminunc(f_and_df2,[5, 4],optimset('GradObj','on'))

Example of fminunc with logx gradient

Additional example for f = (log(x))^2

function [f df_dlogx] = f_and_df(x)
    f = log(x).^2;

    df_dx = 2*log(x)./x;
    df_dlogx = df_dx.* x;
end

and then:

>>x0=3;
>>[x,fval] = fminunc(@f_and_df,x0,optimset('GradObj','on'))
x =
   0.999999990550151

fval =
   8.92996430424197e-17

Example of fminunc with custom gradient and multiple variables

For multiple variables, e.g. f(x,y), you'll have to put your variables into a vector. For example:

function [f df_dx] = f_and_df(x)
    f = x(1).^2 + x(2).^2;

    df_dx(1) = 2*x(1);
    df_dx(2) = 2*x(2);
end

This function corresponds to a paraboloid. Of course, you'll also have to use a vector for the starting point, in this case e.g. x0 = [-5 3].
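For completeness, here is a sketch of the matching call. The function is renamed paraboloid here (my own hypothetical name) so it does not clash with the earlier f_and_df examples, and it is written as a local function so the snippet runs as a single script.

```matlab
% Sketch: minimise the paraboloid from a vector starting point.
options = optimset('GradObj','on');
x0 = [-5 3];                              % one entry per variable
[x, fval] = fminunc(@paraboloid, x0, options);

% Local functions in scripts need R2016b or later; on older releases,
% save this function in its own file paraboloid.m.
function [f, df_dx] = paraboloid(x)
    f = x(1).^2 + x(2).^2;                % objective value
    if nargout > 1
        df_dx = [2*x(1), 2*x(2)];         % one partial derivative per variable
    end
end
```

The returned x should end up close to the origin, which is where this paraboloid has its minimum.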