Faster Image Processing than Lock Bits

Question 1

Faster Image Processing than Lock Bits

c# image image-processing edge-detection lockbits

vkoves · Apr 18, 2013 · Viewed 7.1k times · Source

Answer

Answer

Are you really passing in a floating point number as a percentage to isMatching?

I looked at your code for isMatching on GitHub and well, yikes. You ported this from Java, right? C# uses bool not Boolean and while I don't know for sure, I don't like the looks of code that does that much boxing and unboxing. Further, you're doing a ton of floating point multiplication and comparison when you don't need to:

public static bool IsMatching(Color a, Color b, int percent)
{
    //this method is used to identify whether two pixels, 
    //of color a and b match, as in they can be considered
    //a solid color based on the acceptance value (percent)

    int thresh = (int)(percent * 255);

    return Math.Abs(a.R - b.R) < thresh &&
           Math.Abs(a.G - b.G) < thresh &&
           Math.Abs(a.B - b.B) < thresh;
}

This will cut down the amount of work you're doing per pixel. I still don't like it because I try to avoid method calls in the middle of a per-pixel loop especially an 8x per-pixel loop. I made the method static to cut down on an instance being passed in that isn't used. These changes alone will probably double your performance since we're doing only 1 multiply, no boxing, and are now using the inherent short-circuit of && to cut down the work.

If I were doing this, I'd be more likely to do something like this:

// assert: bitmap.Height > 2 && bitmap.Width > 2
BitmapData data = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height),
                      ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);

int scaledPercent = percent * 255;
unsafe {
    byte* prevLine = (byte*)data.Scan0;
    byte* currLine = prevLine + data.Stride;
    byte* nextLine = currLine + data.Stride;

    for (int y=1; y < bitmap.Height - 1; y++) {

       byte* pp = prevLine + 3;
       byte* cp = currLine + 3;
       byte* np = nextLine + 3;
       for (int x = 1; x < bitmap.Width - 1; x++) {
           if (IsEdgeOptimized(pp, cp, np, scaledPercent))
           {
               // do what you need to do
           }
           pp += 3; cp += 3; np += 3;
       }
       prevLine = currLine;
       currLine = nextLine;
       nextLine += data.Stride;
    }
}

private unsafe static bool IsEdgeOptimized(byte* pp, byte* cp, byte* np, int scaledPecent)
{
    return IsMatching(cp, pp - 3, scaledPercent) &&
           IsMatching(cp, pp, scaledPercent) &&
           IsMatching(cp, pp + 3, scaledPercent) &&
           IsMatching(cp, cp - 3, scaledPercent) &&
           IsMatching(cp, cp + 3, scaledPercent) &&
           IsMatching(cp, np - 3, scaledPercent) &&
           IsMatching(cp, np, scaledPercent) &&
           IsMatching(cp, np + 3, scaledPercent);
}

private unsafe static bool IsMatching(byte* p1, byte* p2, int thresh)
{
    return Math.Abs(p1++ - p2++) < thresh &&
           Math.Abs(p1++ - p2++) < thresh &&
           Math.Abs(p1 - p2) < thresh;
}

Which now does all kinds of horrible pointer mangling to cut down on array accesses and so on. If all of this pointer work makes you feel uncomfortable, you can allocate byte arrays for prevLine, currLine and nextLine and do a Marshal.Copy for each row as you go.

The algorithm is this: start one pixel in from the top and left and iterate over every pixel in the image except the outside edge (no edge conditions! Yay!). I keep pointers to the starts of each line, prevLine, currLine, and nextLine. Then when I start the x loop, I make up pp, cp, np which are previous pixel, current pixel and next pixel. current pixel is really the one we care about. pp is the pixel directly above it, np directly below it. I pass those into IsEdgeOptimized which looks around cp, calling IsMatching for each.

Now this all assume 24 bits per pixel. If you're looking at 32 bits per pixel, all those magic 3's in there need to be 4's, but other than that the code doesn't change. You could parameterize the number of bytes per pixel if you want so it could handle either.

FYI, the channels in the pixels are typically b, g, r, (a).

Colors are stored as bytes in memory. Your actual Bitmap, if it is a 24 bit image is stored as a block of bytes. Scanlines are data.Stride bytes wide, which is at least as large as 3 * the number of pixels in a row (it may be larger because scan lines are often padded).

When I declare a variable of type byte * in C#, I'm doing a few things. First, I'm saying that this variable contains the address of a location of a byte in memory. Second, I'm saying that I'm about to violate all the safety measures in .NET because I could now read and write any byte in memory, which can be dangerous.

So when I have something like:

Math.Abs(*p1++ - *p2++) < thresh

What it says is (and this will be long):

Take the byte that p1 points to and hold onto it
Add 1 to p1 (this is the ++ - it makes the pointer point to the next byte)
Take the byte that p2 points to and hold onto it
Add 1 to p2
Subtract step 3 from step 1
Pass that to Math.Abs.

The reasoning behind this is that, historically, reading the contents of a byte and moving forward is a very common operation and one that many CPUs build into a single operation of a couple instructions that pipeline into a single cycle or so.

When we enter IsMatching, p1 points to pixel 1, p2 points to pixel 2 and in memory they are laid out like this:

p1    : B
p1 + 1: G
p1 + 2: R

p2    : B
p2 + 1: G
p2 + 2: R

So IsMatching just does the the absolute difference while stepping through memory.

Your follow-on question tells me that you don't really understand pointers. That's OK - you can probably learn them. Honestly, the concepts really aren't that hard, but the problem with them is that without a lot of experience, you are quite likely to shoot yourself in the foot, and perhaps you should consider just using a profiling tool on your code and cooling down the worst hot spots and call it good.

For example, you'll note that I look from the first row to the penultimate row and the first column to the penultimate column. This is intentional to avoid having to handle the case of "I can't read above the 0th line", which eliminates a big class of potential bugs which would involve reading outside a legal memory block, which may be benign under many runtime conditions.

Question 2

I've been working on an edge detection program in C#, and to make it run faster, I recently made it use lock bits. However, lockBits is still not as fast as I would like it to run. Although the problem could be my general algorithm, I'm also wondering if there is anything better than lockBits I can use for image processing.

In case the problem is the algorithm, here's a basic explanation. Go through an array of Colors (made using lockbits, which represent pixels) and for each Color, check the color of the eight pixels around that pixel. If those pixels do not match the current pixel closely enough, consider the current pixel an edge.

Here's the basic code that defines if a pixel is an edge. It takes in a Color[] of nine colors, the first of which is the pixel is to check.

public Boolean isEdgeOptimized(Color[] colors)
{
    //colors[0] should be the checking pixel
    Boolean returnBool = true;
    float percentage = percentageInt; //the percentage used is set
    //equal to the global variable percentageInt

    if (isMatching(colors[0], colors[1], percentage) &&
            isMatching(colors[0], colors[2], percentage) &&
            isMatching(colors[0], colors[3], percentage) &&
            isMatching(colors[0], colors[4], percentage) &&
            isMatching(colors[0], colors[5], percentage) &&
            isMatching(colors[0], colors[6], percentage) &&
            isMatching(colors[0], colors[7], percentage) &&
            isMatching(colors[0], colors[8], percentage))
    {
        returnBool = false;
    }
    return returnBool;
}

This code is applied for every pixel, the colors of which are fetched using lockbits.

So basically, the question is, how can I get my program to run faster? Is it my algorithm, or is there something I can use that is faster than lockBits?

By the way, the project is on gitHub, here

Faster Image Processing than Lock Bits

Answer

Related questions