I am trying to use the fminunc function for convex optimization. However, in my case I am taking the gradient with respect to log(x). Let my objective function be F. Then the gradients are related by
dF/dx = (dF/dlogx) * (1/x)
=> dF/dlogx = (dF/dx) * x
So the gradient-descent update in log space would be
logx_new = logx_old - learning_rate * x * (dF/dx)
         = logx_old - learning_rate * (dF/dlogx)
x_new = exp(logx_new)
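In plain gradient-descent code, the update I have in mind would look something like this (F = x.^2 is only a stand-in objective so the sketch runs; the step size and iteration count are arbitrary):
% sketch of the manual log-space update; F = x.^2 is only a stand-in
dF_dx = @(x) 2*x;                   % gradient of the stand-in objective
learning_rate = 0.1;
logx = log(5);                      % arbitrary starting point
for iter = 1:100
    x = exp(logx);
    dF_dlogx = x * dF_dx(x);        % chain rule: dF/dlogx = (dF/dx) * x
    logx = logx - learning_rate * dF_dlogx;
end
x_new = exp(logx)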
How can I implement this in fminunc?
It's possible and described in the documentation:
If the gradient of fun can also be computed and the GradObj option is 'on', as set by options = optimset('GradObj','on'), then the function fun must return, in the second output argument, the gradient value g, a vector, at x.
So, for example, if f = @(x) x.^2, then df/dx = 2*x, and you can use
function [f df] = f_and_df(x)
    f = x.^2;           % objective value
    if nargout > 1      % compute gradient only when fminunc requests it
        df = 2*x;
    end
end
You can then pass that function to fminunc:
options = optimset('GradObj','on');
x0 = 5;
[x,fval] = fminunc(@f_and_df,x0,options);
For your logx gradient, this becomes:
function [f df] = f_and_df(x)
    f = ...;              % your objective F evaluated at x
    if nargout > 1
        df = x .* (...);  % fill in dF/dx here; dF/dlogx = (dF/dx) * x
    end
end
and the fminunc call stays the same.
If you want, you can also use anonymous functions:
f_and_df2 = @(x) deal(x(1).^2+x(2).^2,[2*x(1) 2*x(2)]);
[x,fval] = fminunc(f_and_df2,[5, 4],optimset('GradObj','on'))
An additional example, for f = (log(x))^2:
function [f df_dlogx] = f_and_df(x)
    f = log(x).^2;
    df_dx = 2*log(x)./x;     % gradient with respect to x
    df_dlogx = df_dx .* x;   % chain rule: dF/dlogx = (dF/dx) * x
end
and then:
>> x0 = 3;
>> [x,fval] = fminunc(@f_and_df,x0,optimset('GradObj','on'))
x =
0.999999990550151
fval =
8.92996430424197e-17
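For completeness, the same example can also be written in the anonymous-function form from above, since df_dx .* x simplifies to 2*log(x) here (the name f_and_df_log is just illustrative):
f_and_df_log = @(x) deal(log(x).^2, 2*log(x));
[x,fval] = fminunc(f_and_df_log, 3, optimset('GradObj','on'))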
For multiple variables, e.g. f(x,y), you'll have to put your variables into a vector, for example:
function [f df_dx] = f_and_df(x)
    f = x(1).^2 + x(2).^2;    % paraboloid
    df_dx(1) = 2*x(1);
    df_dx(2) = 2*x(2);
end
This function corresponds to a paraboloid. Of course, you'll also have to use a vector for the initial starting parameters, in this case e.g. x0 = [-5 3].
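The corresponding call is the same as before, just with the vector starting point:
x0 = [-5 3];
[x,fval] = fminunc(@f_and_df, x0, optimset('GradObj','on'))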