Understanding darknet's yolo.cfg config files

Reda Drissi · May 17, 2018

I have searched around the internet but found very little information on this, and I don't understand what each variable/value represents in YOLO's .cfg files. I was hoping some of you could help. I don't think I'm the only one having this problem, so if anyone knows two or three of the variables, please post them so that people who need this information in the future can find it.

The main ones I'd like to know are:

  • batch
  • subdivisions
  • decay
  • momentum
  • channels
  • filters
  • activation

Answer

FelEnd · Jun 5, 2018

Here is my current understanding of some of the variables. Not necessarily correct though:

[net]

  • batch: This many images (and their labels) are used in one forward pass to compute a gradient and update the weights via backpropagation (see the [net] excerpt after this list).
  • subdivisions: The batch is subdivided into this many "blocks"; the images of one block are run in parallel on the GPU.
  • decay: Probably a weight-decay term that shrinks the weights to keep them from growing too large, for stability reasons I guess.
  • channels: Better explained by the figure this answer originally linked: on the left there is a single channel of 4x4 pixels; the reorganization layer halves the spatial size and creates 4 channels, with adjacent pixels placed into different channels. A cfg excerpt showing this pattern follows just below.
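
If it helps to see the reorg layer in context, it shows up in yolov2-style cfgs roughly like this (an illustrative excerpt; the layer offsets and filter counts differ between files, and lines starting with # are comments that darknet's cfg parser skips):

    [route]
    # pull in an earlier, higher-resolution feature map
    layers=-9

    [convolutional]
    batch_normalize=1
    size=1
    stride=1
    pad=1
    filters=64
    activation=leaky

    [reorg]
    # stride=2 turns each 2x2 block of pixels into 4 channels,
    # i.e. W x H x C becomes (W/2) x (H/2) x (4*C)
    stride=2

    [route]
    # concatenate the reorganized map with the main branch
    layers=-1,-4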

  • momentum: I guess the new gradient is computed as momentum * previous_gradient + (1 - momentum) * gradient_of_current_batch. It makes the gradient more stable.
  • adam: Uses the Adam optimizer? It didn't work for me, though.
  • burn_in: For the first burn_in batches, the learning rate is slowly increased until it reaches its final value (your learning_rate parameter). You can use this phase to decide on a learning rate by watching up to which value the loss still decreases (before it starts to diverge).
  • policy=steps: Use the steps and scales parameters below to adjust the learning rate during training
  • steps=500,1000: Adjust the learning rate after 500 and 1000 batches
  • scales=0.1,0.2: After 500 batches multiply the LR by 0.1, then after 1000 batches multiply it again by 0.2.
  • angle: Augment images by rotating them up to this angle (in degrees).
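
To see those [net] parameters together, here is a rough excerpt in the style of the stock yolov3 cfg (the values are only an illustration, and the comments reflect my interpretation above, so treat both with the same caution):

    [net]
    # batch / subdivisions: 64 images per weight update,
    # processed 64/16 = 4 at a time on the GPU
    batch=64
    subdivisions=16
    # input dimensions: width x height x channels (3 = RGB)
    width=416
    height=416
    channels=3
    # optimizer settings
    momentum=0.9
    decay=0.0005
    # data augmentation
    angle=0
    saturation=1.5
    exposure=1.5
    hue=.1
    # learning-rate schedule: ramp up over the first 1000 batches (burn_in),
    # then multiply the LR by 0.1 at batch 400000 and again at batch 450000
    learning_rate=0.001
    burn_in=1000
    max_batches=500200
    policy=steps
    steps=400000,450000
    scales=.1,.1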

layers

  • filters: How many convolutional kernels there are in a layer (see the layer excerpt after this list).
  • activation: The activation function: relu, leaky (leaky ReLU), etc. See src/activations.h for the full list.
  • stopbackward: Backpropagation only runs back to this layer. Put it in the penultimate convolutional layer before the first yolo layer to train only the layers that come after it, e.g. when starting from pretrained weights.
  • random: Put it in the yolo layers. If set to 1, data augmentation is done by resizing the input to different sizes every few batches. Use it to generalize over object sizes.
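
For the per-layer parameters, the end of a yolov3-style cfg looks roughly like the following (an illustrative excerpt; the anchors line and a few other options are omitted for brevity, and the filters comment reflects the usual (classes + 5) * masks rule for the detection head):

    [convolutional]
    size=1
    stride=1
    pad=1
    # one kernel per output value: (80 classes + 5) * 3 masks = 255
    filters=255
    activation=linear

    [yolo]
    mask=0,1,2
    classes=80
    num=9
    jitter=.3
    ignore_thresh=.7
    truth_thresh=1
    # resize the network input every few batches to vary object scale
    random=1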

Many things are more or less self-explanatory (size, stride, batch_normalize, max_batches, width, height). If you have more questions, feel free to comment.

Again, please keep in mind that I am not 100% certain about many of those.