OpenCV: Improving the speed of Cascades detection

JustCause · Jul 25, 2014 · Viewed 7.8k times

I need to detect people in real time using OpenCV cascades. Currently I am using the trained cascade files that come with OpenCV, but later I will train my own LBP cascades to get more speed. I do have a question.

What are the ways to speed up cascade detection? For example, have a look at this video. It is really fast, uses Haar cascades, and looks nice. What kind of things can I do to achieve this speed, especially for a real-time application? Any tricks and hacks?

Answer

bjou · Jul 29, 2014

I'm not sure what you mean by "speed" in your video example, since it's hard to make out what "speed" the detections are done at there. In computer vision, when we talk about the "speed" of detection, we generally mean the frames per second (FPS) or the millisecond run-time of the algorithm on a single video or a set of videos. If the FPS achieved by the algorithm is the same as the FPS of the input video, this is called real-time or 1x processing speed. If the processing FPS is greater than the input FPS, you have faster than real-time processing, and if it is smaller, you have slower than real-time. I will assume you meant the same thing when you said "speed".
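To make that concrete, here is a minimal sketch of how you might measure the processing FPS of a cascade detector with OpenCV's Python bindings. The cascade file path and the camera index are assumptions; swap in whatever cascade and video source you actually use.

```python
import cv2
import time

# Assumed cascade file -- point this at whatever cascade you trained/use.
cascade = cv2.CascadeClassifier("haarcascade_fullbody.xml")

cap = cv2.VideoCapture(0)          # webcam; a video file path also works
frames, start = 0, time.time()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    detections = cascade.detectMultiScale(gray)
    frames += 1
    if frames % 30 == 0:
        # Processing FPS: compare this against the input video's FPS
        # to see whether you are at, above, or below real-time.
        print("processing FPS: %.1f" % (frames / (time.time() - start)))

cap.release()
```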

Given this, let me give you two ways to speed up detection. I really suggest reading two papers that have set the bar in pedestrian detection over the past several years: The Fastest Pedestrian Detector in the West and Pedestrian Detection at 100 Frames per Second. Both attack the computational bottleneck of the traditional detection setting: performing detection at multiple scales. The latter has publicly available code here and here. So this is one of the areas where you can gain improvement: scale sizes.
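In plain OpenCV terms, the cheapest way to exploit this is simply to reduce how many scales `detectMultiScale` has to evaluate, by using a coarser scale step and a tight size range for the people you expect to see. A rough sketch follows; the file paths and the exact numbers are assumptions you would tune for your camera setup.

```python
import cv2

cascade = cv2.CascadeClassifier("lbpcascade_frontalface.xml")  # assumed cascade
gray = cv2.imread("frame.png", 0)                              # assumed test frame, grayscale

# Fewer pyramid levels: a larger scaleFactor and a narrow min/max size
# range mean far fewer windows get classified per frame.
detections = cascade.detectMultiScale(
    gray,
    scaleFactor=1.2,      # default is 1.1; a bigger step = fewer scales
    minNeighbors=3,
    minSize=(60, 120),    # smallest target you expect, in pixels
    maxSize=(200, 400),   # largest target you expect
)
```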

The method implemented natively in OpenCV is based on a variant of the Viola-Jones method that extends the Haar-like feature set used in detection. Another area of improvement to consider is windowing. Traditional detection methods, including the one implemented natively in OpenCV, require that you slide a window at each scale across the image, usually row-wise from the upper-left to the bottom-right. A classic way to get around this is Efficient Subwindow Search (ESS), which performs branch-and-bound optimization. There have been many extensions building on it, but it's an excellent place to start and understand the basics of object detection.
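ESS itself takes some effort to implement. A much simpler trick in the same spirit of evaluating fewer windows is to restrict detection to a region of interest, e.g. the neighbourhood of last frame's detections or a foreground mask from background subtraction. A rough sketch, where the ROI coordinates are placeholders:

```python
import cv2

cascade = cv2.CascadeClassifier("lbpcascade_frontalface.xml")  # assumed cascade
frame = cv2.imread("frame.png")                                # assumed input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Placeholder ROI: in practice this would come from tracking or motion cues.
x, y, w, h = 100, 50, 320, 240
roi = gray[y:y + h, x:x + w]

detections = cascade.detectMultiScale(roi)

# Map the ROI-relative boxes back to full-frame coordinates.
detections = [(dx + x, dy + y, dw, dh) for (dx, dy, dw, dh) in detections]
```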

Now, of course, one very obvious way to speed up the detection process is to parallelize your code, e.g. with multi-threading or a GPU. There are several GPU implementations that are publicly available, e.g. here, using a support vector machine-based detector.
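For reference, the SVM-based detector those GPU ports accelerate is the classic HOG + linear SVM pedestrian detector, which OpenCV also ships on the CPU side. A minimal sketch of the CPU version follows (the input frame is an assumption); OpenCV builds with CUDA support expose a GPU counterpart with the same overall workflow (cv::gpu::HOGDescriptor in 2.4, cv::cuda::HOG in 3.x).

```python
import cv2

# OpenCV's built-in HOG + linear SVM people detector (CPU version).
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame = cv2.imread("frame.png")  # assumed input frame

# A larger winStride and a coarser scale trade accuracy for speed.
rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                      padding=(8, 8), scale=1.05)

for (x, y, w, h) in rects:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```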