I am currently working on a system for robust hand detection.
The first step is to take a photo of the hand (in HSV color space) with the hand placed in a small rectangle to determine the skin color. I then apply a thresholding filter to set all non-skin pixels to black and all skin pixels white.
So far it works quite well, but I wanted to ask if there is a better way to solve this? For example, I found a few papers mentioning concrete color spaces for caucasian people, but none with a comparison for asian/african/caucasian color-tones.
By the way, I'm working with OpenCV via Python bindings.
Have you taken a look at the camshift paper by Gary Bradski? You can download it from here
I used the the skin detection algorithm a year ago for detecting skin regions for hand tracking and it is robust. It depends on how you use it.
The first problem with using color for tracking is that it is not robust to lighting variations or like you mentioned, when people have different skin tones. However this can be solved easily as mentioned in the paper by:
Throwing away the V channel in HSV and only considering H and S channels is really enough (surprisingly) to detect different skin tones and under different lighting variations. A plus side is that its computation is fast.
These steps and the corresponding code can be found in the original OpenCV book.
As a side note, I've also used Gaussian Mixture Models (GMM) before. If you are only considering color then I would say using histograms or GMM makes not much difference. In fact the histogram would perform better (if your GMM is not constructed to account for lighting variations etc.). GMM is good if your sample vectors are more sophisticated (i.e. you consider other features) but speed-wise histogram is much faster because computing the probability map using histogram is essentially a table lookup whereas GMM requires performing a matrix computation (for vector with dimension > 1 in the formula for multi-dimension gaussian distribution) which can be time consuming for real time applications.
So in conclusion, if you are only trying to detect skin regions using color, then go with the histogram method. You can adapt it to consider local gradient as well (i.e. histogram of gradients but possibly not going to the full extent of Dalal and Trigg's human detection algo.) so that it can differentiate between skin and regions with similar color (e.g. cardboard or wooden furniture) using the local texture information. But that would require more effort.
For sample source code on how to use histogram for skin detection, you can take a look at OpenCV"s page here. But do note that it is mentioned on that webpage that they only use the hue channel and that using both hue and saturation would give better result.
For a more sophisticated approach, you can take a look at the work on "Detecting naked people" by Margaret Fleck and David Forsyth. This was one of the earlier work on detecting skin regions that considers both color and texture. The details can be found here.
A great resource for source code related to computer vision and image processing, which happens to include code for visual tracking can be found here. And not, its not OpenCV.
Hope this helps.