I'm using Selenium to automate webpage functional testing. It's important for us to do a pixel-by-pixel comparison when we roll out new code, so we're using Selenium to take screenshots and comparing the base64 encoded strings to see if anything has changed.
We're finding that in practice, it's hard to get complete pixel consistency, especially with images. I would like minor blurriness / rendering artifacts to count as a "pass" instead of a "fail", so I'm wondering if there's a way of doing a fuzzy comparison to make our tests a bit less fragile.
I was thinking of maybe looking at the Levenshtein distance between the base64 strings as a starting point, but I don't really know if that's a good approach, or what the tolerances should be that distinguish "something moved on the page" from "rendering artifact". Any ideas / approaches?
So I ended up going with the ImageMagick command-line tool (because why re-invent image comparison). The "Peak Absolute Error" metric of the "compare" tool tells you how much you have to fuzz pixels before two images are identical. This seems to work well... for an image with slight graphical distortions, there might be a lot of pixels that don't match, but slight fuzzing is enough to make them match; but for two images that are actually different, even though most pixels might match, the ones that don't tend to be very different. Right now I'm checking for a PAE of less than 15% to see if the images should be counted as identical. Command line I'm using is:
compare -metric PAE original.png new.png comparison.png
Documentation on ImageMagick's compare tool is here: http://www.imagemagick.org/script/compare.php