I want to write a new optimization algorithm for my network on Tensorflow. I hope to implement the Levenberg Marquardt optimization algorithm, which now is excluded from TF API. I found poor documentation on how to write a custom optimizer, so i ask if someone can give my any advice. Thanks.
The simplest example of an optimizer is probably the gradient descent optimizer. It shows how one creates an instance of the basic optimizer class. The optimizer base class documentation explains what the methods do.
The python side of the optimizers adds new nodes to the graph that compute and apply the gradients being back-propagated. It supplies the parameters that get passed to the ops and does some of the high-level management of the optimizer. Then, you need the actual "Apply" op.
Ops have both a python and a C++ component. Writing a training op is the same (but specialized) as the general process of adding an Op to TensorFlow.
For an example set of training ops that compute and apply gradients, see python/training/training_ops.py - this is the Python glue for the actual training ops. Note that the code here is mostly about shape inference - the computation is going to be in the C++.
The actual math for applying the gradients is handled by an Op (recalling that, in general, ops are written in C++). In this case, the apply gradients ops are defined in core/kernels/training_ops.cc. You can see, for example, the implementation of ApplyGradientDescentOp in there, which references a functor ApplyGradientDescent:
var.device(d) -= grad * lr();
The implementation of the Op itself follows the implementation of any other op as described in the adding-an-op docs.