I couldn't find a good explanation of YOLOv3 SPP, which has a better mAP
than YOLOv3. The author himself describes YOLOv3 SPP like this on his repo:
"YOLOv3 with spatial pyramid pooling, or something"
But I still don't really understand it. In yolov3-spp.cfg
I noticed these additions:
### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
Can anybody explain further how YOLOv3 SPP works? Why are layers -2, -4 and -1, -3, -5, -6 chosen in the [route] layers? Thanks.
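On the [route] indices specifically: in darknet cfg files a negative value in layers= is an offset relative to the route layer itself, so -2 means "the output of the layer two positions back". The sketch below is a simplified, hypothetical model of the block above (the names block, resolve and "x" are made up for illustration, and it is not darknet code). It assumes "x" is the output of the convolutional layer just before the SPP block and only tracks which feature maps each layer outputs.

```python
# Simplified model of the SPP block: (layer type, parameter).
# For "maxpool" the parameter is the kernel size; for "route" it is the
# list of relative offsets from the cfg. "input" stands for the conv layer
# that immediately precedes the block (an assumption of this sketch).
block = [
    ("input", None),
    ("maxpool", 5),
    ("route", [-2]),
    ("maxpool", 9),
    ("route", [-4]),
    ("maxpool", 13),
    ("route", [-1, -3, -5, -6]),
]

def resolve(layers):
    """Return, per layer, the list of feature maps it outputs (as labels)."""
    outputs = []
    for i, (kind, param) in enumerate(layers):
        if kind == "input":
            outputs.append(["x"])
        elif kind == "maxpool":
            # A maxpool layer reads the output of the previous layer.
            outputs.append([f"maxpool{param}({f})" for f in outputs[i - 1]])
        elif kind == "route":
            # A route layer concatenates the outputs at the given offsets.
            merged = []
            for offset in param:
                merged.extend(outputs[i + offset])
            outputs.append(merged)
    return outputs

resolved = resolve(block)
for (kind, param), out in zip(block, resolved):
    print(kind, param, "->", out)
```

Read this way, layers=-2 and layers=-4 both jump back to the same feature map x, so each of the three max-pools (5, 9, 13) is applied to x in parallel rather than being stacked on top of each other. The final route concatenates the three pooled maps with x itself along the channel dimension. With stride=1 the pooled maps keep the same spatial size as x, which is what makes the concatenation possible, and the 1x1 convolution with filters=512 that follows brings the channel count back down.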
Some researchers have finally published a paper about applying SPP in YOLO: https://arxiv.org/abs/1903.08589.
[Figure: differences between yolov3-tiny, yolov3, and yolov3-spp]
However, the paper's authors got only mAP = 79.6% on the Pascal VOC 2007 test set using the Yolov3SPP model on the original framework.
We can achieve a higher accuracy of mAP = 82.1% even with the yolov3.cfg model by using AlexeyAB's repository: https://github.com/AlexeyAB/darknet/issues/2557#issuecomment-474187706
So we can surely achieve an even higher mAP with yolov3-spp.cfg using AlexeyAB's repo.
Original GitHub question: https://github.com/AlexeyAB/darknet/issues/2859