I am working on a school project to detect how many students are in a classroom, like in this picture.
I have been trying to use a Haar cascade in OpenCV (the approach normally used for face detection) to detect people, but the results are very bad. Like this:
I took thousands of pictures in the classroom and manually cropped out the regions containing people. There are about 4000 positive samples and 12000 negative samples. What did I do wrong? When I crop the images, should I crop only the head, like this? Or the head together with the body, like this?
I think I have enough training samples, and I followed the exact procedure in this post: http://note.sonots.com/SciSoftware/haartraining.html#v6f077ba, which should work. Or should I use a different algorithm, like HOG with an SVM? Any suggestions would be great. I have been stuck on this for months and don't have a clue. Thanks a lot!
Haar cascades are better suited to human faces. HOG with an SVM is the classic approach to human detection, and there is plenty of source code and many blog posts about it, so training a classifier is not hard. For your scene, I think 'head and shoulders' crops would work better than 'head alone'. However, your multi-view samples make the problem harder; a camera facing the students would be better. If you keep getting many false positives, add more hard negative samples. This paper may help: http://irip.buaa.edu.cn/~zxzhang/papers/icip2009-1.pdf
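To make the hard-negative advice concrete, here is a rough sketch of one mining pass. It is not tied to OpenCV or any other library: `detect` and `crop` are hypothetical callables standing in for your trained detector and your own cropping routine, and `max_per_image` is just an illustrative cap. The idea is to run the current detector on images you know contain no people, treat every detection as a false positive, and add those crops to the negative set before retraining.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def mine_hard_negatives(
    detect: Callable[[object], Sequence[Box]],  # hypothetical: your trained detector
    person_free_images: Sequence[object],       # images known to contain no people
    crop: Callable[[object, Box], object],      # hypothetical: cuts a box out of an image
    max_per_image: int = 10,                    # illustrative cap so one image cannot dominate
) -> List[object]:
    """Collect false-positive crops to use as extra negative samples."""
    hard_negatives = []
    for image in person_free_images:
        # The image has no people, so every detection is a false positive.
        false_positives = list(detect(image))[:max_per_image]
        hard_negatives.extend(crop(image, box) for box in false_positives)
    return hard_negatives
```

You would then append the returned crops to your negative samples, retrain, and repeat the pass until the false-positive rate on person-free images stops improving.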