Object Tracking in EmguCV

Peter · Jan 5, 2012 · Viewed 17.8k times

I am building an object tracking program that should track an unknown object. The user selects a region in the live video stream, and that region is then tracked. My project is similar to this video:

http://www.youtube.com/watch?v=G5GLIKIkd6E

I have tried one method, but it is not robust enough and the tracker moves around a lot, so I am starting from scratch again.

Does anyone know a method I could use to get a result like the one in the video? I am new to EmguCV and right now I really have no idea where to start.

Answer

Chris · Jan 6, 2012

The video suggests template matching; given the speed, I expect it is most likely an FFT (Fast Fourier Transform) method. This is fairly easy to implement in EMGU, however getting it perfect is hard.


Template Matching

First, the template matching method. I have written a method that will match an object within an image you feed into it. FFT only works on single-channel images; for colour you will have to split the channels and add the result matrices together:

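// Requires: using System; using System.Drawing; using Emgu.CV; using Emgu.CV.Structure; using Emgu.CV.CvEnum;
Point Location;               // top-left corner of the best match, set by Detect_object
Image<Gray, Double> Results;  // last match-score map, assigned in Detect_object (declaration assumed, it is not shown in the original)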

private bool Detect_object(Image<Gray, Byte> Area_Image, Image<Gray, Byte> image_object)
{
    bool success = false;

    //Work out padding array size
    Point dftSize = new Point(Area_Image.Width + (image_object.Width * 2), Area_Image.Height + (image_object.Height * 2));
    //Pad the Array with zeros
    using (Image<Gray, Byte> pad_array = new Image<Gray, Byte>(dftSize.X, dftSize.Y))
    {
        //copy centre
        pad_array.ROI = new Rectangle(image_object.Width, image_object.Height, Area_Image.Width, Area_Image.Height);
        CvInvoke.cvCopy(Area_Image, pad_array, IntPtr.Zero);

        pad_array.ROI = (new Rectangle(0, 0, dftSize.X, dftSize.Y));

        //Match Template
        using (Image<Gray, float> result_Matrix = pad_array.MatchTemplate(image_object, TM_TYPE.CV_TM_CCOEFF_NORMED))
        {
            Point[] MAX_Loc, Min_Loc;
            double[] min, max;
            //Limit ROI to look for Match

            result_Matrix.ROI = new Rectangle(image_object.Width, image_object.Height, Area_Image.Width - image_object.Width, Area_Image.Height - image_object.Height);

            result_Matrix.MinMax(out min, out max, out Min_Loc, out MAX_Loc);

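            // Best match: this is the top-left corner of the matched region, not its centre
            Location = new Point(MAX_Loc[0].X, MAX_Loc[0].Y);
            success = true;
            // Keep a copy of the score map; result_Matrix itself is disposed at the end of the using block
            Results = result_Matrix.Convert<Gray, Double>();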

        }
    }
    return success;
}

The thing most people forget is to pad the array with zeros, with a margin the size of the template. We use zeros as they have no effect on the FFT method. Without the padding, the data around the edge is not processed properly and we can miss matching items there.

The second point, and I can't stress how important this is, is that the FFT method will at the moment return the location of the object's top left hand corner, not its centre. result_Matrix.MinMax finds the place where the object is most likely to have matched. There is a lot that you will need to experiment with, so if you have any more problems ask here or on the EMGU forum and I'll help when I can. I will copy and paste this solution over as well.
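Since the match comes back as a top left hand corner, drawing a marker on the object or measuring how far it moved needs a small conversion first. A minimal sketch, assuming only System.Drawing types; MatchGeometry, MatchBounds and MatchCentre are hypothetical names used for illustration and are not part of EMGU:

```csharp
using System;
using System.Drawing;

static class MatchGeometry
{
    // Hypothetical helper: bounding box of a match whose top-left corner is topLeft.
    public static Rectangle MatchBounds(Point topLeft, Size templateSize)
    {
        return new Rectangle(topLeft, templateSize);
    }

    // Hypothetical helper: centre of the matched region (integer division, so rounded down).
    public static Point MatchCentre(Point topLeft, Size templateSize)
    {
        return new Point(topLeft.X + templateSize.Width / 2,
                         topLeft.Y + templateSize.Height / 2);
    }
}
```

With the method above this could be called as MatchGeometry.MatchCentre(Location, new Size(image_object.Width, image_object.Height)).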


The Method in the Video

Well, I will leave you to code most of this as I am stuck for time, but in effect the user uses the click event of a paintbox to set the e.X and e.Y location of an object within the image. The template is of a fixed size, say 100x100:

Image<Gray, Byte> template_img = Main_Image.Copy(new Rectangle(x, y, 100, 100));

He then sets an ROI on the original image around the object; this accounts for movement. In our case, say we want a buffer (ROI) around the template of 50 pixels. This would equate to an initial ROI of:

Main_Image.ROI = new Rectangle(x - 50, y - 50, 200, 200);

Working directly with an ROI set on the main image can slow down the processing as well as mess up displaying the original image again, so it would be much better to do something like this:

using (Image<Gray, Byte> img_ROI = Main_Image.Copy(new Rectangle(x - 50, y - 50, 200, 200)))
{
    Detect_object(img_ROI, template_img);
}

We use a using statement as this disposes of the extra image data when we've finished and frees up resources.
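One thing to watch with a rectangle such as (x - 50, y - 50, 200, 200) is that it can run off the frame when the object is near a border. A minimal sketch of clipping the search window to the frame, using only System.Drawing; SearchWindow and Around are hypothetical names, and the margin of 50 is simply the value used in this answer:

```csharp
using System;
using System.Drawing;

static class SearchWindow
{
    // Hypothetical helper: the template rectangle grown by margin pixels
    // on every side, then clipped so it never leaves the frame.
    public static Rectangle Around(Point templateTopLeft, Size templateSize,
                                   int margin, Size frameSize)
    {
        Rectangle wanted = new Rectangle(
            templateTopLeft.X - margin,
            templateTopLeft.Y - margin,
            templateSize.Width + 2 * margin,
            templateSize.Height + 2 * margin);

        // Rectangle.Intersect returns the overlap of the two rectangles
        // (Rectangle.Empty if they do not overlap at all).
        return Rectangle.Intersect(wanted, new Rectangle(Point.Empty, frameSize));
    }
}
```

For a 100x100 template with a margin of 50 this gives the same 200x200 window as above whenever the window fits inside the frame. If the clipped window ends up smaller than the template there is nothing to match against, so that case is worth checking before calling Detect_object.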

Now for the trick: the ROI is actually controlled by the results from Detect_object, which is why we keep Location as a global variable. Once we have matched the template successfully, our using statement will look more like:

using (Image<Gray, Byte> img_ROI = Main_Image.Copy(new Rectangle(Location.X - 50, Location.Y - 50, 200, 200)))
{
    ...
}
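A detail to check when experimenting: Detect_object only sees the cropped copy, so the position it reports is presumably measured from the corner of that crop rather than from the corner of the full frame. If that is the case, the crop's offset has to be added back before Location is used to place the next ROI. A minimal sketch of that conversion, using only System.Drawing; TrackerCoordinates and ToFrame are hypothetical names:

```csharp
using System;
using System.Drawing;

static class TrackerCoordinates
{
    // Hypothetical helper: convert a match position measured inside the
    // cropped search image into full-frame coordinates.
    public static Point ToFrame(Point locationInCrop, Rectangle cropInFrame)
    {
        return new Point(cropInFrame.X + locationInCrop.X,
                         cropInFrame.Y + locationInCrop.Y);
    }
}
```

For example, a match at (50, 50) inside a crop taken at (100, 150) sits at (150, 200) in the full frame, and the next search window would be built around that point.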

That's pretty much it, other than that rectangles showing the size and location of the ROI and the template are drawn on the image. If you have problems with that let me know, but the code should be readily available out there.

Cheers,

Chris