Perceptual Image Downsampling

tzenes · Nov 22, 2009 · Viewed 10.2k times

So here is my problem:

I have a large (high-resolution) image, and it needs to be small (much lower resolution).

So I do the naive thing (kill every other pixel) and the result looks poor.

So I try something more intelligent (low-pass filtering using a Fourier transform and resampling in Fourier space), and the result is a little better but still fairly poor.

So my question: is there a perceptually motivated image down-sampling algorithm (or implementation)?

edit: While I am aware of a number of resampling techniques, my application is more concerned with preserving the perceptual features, rather than producing smooth images.

edit2: It is safe to assume I have some level of familiarity with digital signal processing, convolutions, wavelet transforms, etc.

Answer

Ori Pessach · Nov 23, 2009

Read this:

http://www.dspguide.com/

OK, that's quite a read. But understanding filter design would be handy.

In general, the process for scaling an image from W1 x H1 to W2 x H2, where W1, W2, H1 and H2 are integers, is to find a new W3 and H3 such that W1 and W2 are both integer factors of W3, and H1 and H2 are both integer factors of H3 (a common multiple in each dimension). You then insert zeros between the pixels of the original image to space them out until the image is W3 x H3 in size. This introduces high frequencies due to the discontinuities in the image, so you apply a low-pass filter to it, with a cutoff low enough that nothing above the Nyquist limit of the target size survives, and then decimate the filtered image to its new size (W2 x H2). It sounds like you might be trying to do this already, but the filtering can be done in the spatial domain (the image equivalent of the time domain), so the Fourier transform isn't really necessary.
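To make those three steps concrete, here is a minimal one-dimensional sketch in plain Python, meant to be applied to one row (and then one column) at a time. Everything named here is my own assumption rather than something from a library: `windowed_sinc`, `resample_row`, the Hamming-windowed sinc as the low-pass filter, the `half_width` parameter, and treating pixels beyond the border as zero (which darkens the first and last few output samples).

```python
import math

def windowed_sinc(cutoff, num_taps):
    """Low-pass FIR kernel: an ideal sinc with the given cutoff (in cycles per
    sample, 0 < cutoff <= 0.5), shaped by a Hamming window and normalised so
    the taps sum to 1. num_taps is assumed to be odd."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def resample_row(row, w2, half_width=4):
    """Resample a list of W1 samples to w2 samples by zero-stuffing to W3,
    low-pass filtering, and decimating."""
    w1 = len(row)
    w3 = w1 * w2 // math.gcd(w1, w2)      # least common multiple of W1 and W2
    up, down = w3 // w1, w3 // w2

    # Step 1: space the original samples out with zeros. Scaling by `up`
    # compensates for the energy lost to the inserted zeros.
    stuffed = [0.0] * w3
    for i, value in enumerate(row):
        stuffed[i * up] = value * up

    # Step 2: low-pass filter on the W3 grid. The cutoff is the lower of the
    # source and target Nyquist limits, expressed on that grid.
    widest = max(up, down)
    taps = windowed_sinc(0.5 / widest, 2 * half_width * widest + 1)
    mid = len(taps) // 2
    filtered = []
    for p in range(w3):
        acc = 0.0
        for t, coeff in enumerate(taps):
            k = p + mid - t
            if 0 <= k < w3:               # samples outside the row count as zero
                acc += coeff * stuffed[k]
        filtered.append(acc)

    # Step 3: keep every `down`-th filtered sample.
    return filtered[::down]
```

For example, `resample_row(row, 32)` on a 48-sample row works on a 96-sample intermediate grid (up by 2, down by 3). A larger `half_width` gives a sharper filter at the cost of more ringing near edges, which is exactly the kind of trade-off that matters if you care about perceptual quality.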

In practice, the process I just described is optimized. You'll note that when applying a convolution filter to the upscaled image, most of the terms will be 0, so you can avoid most of the multiplication operations in your algorithm. And since you end up throwing away many of the filtered results, you don't need to calculate those either. You basically end up with a handful of multiplications and additions for each pixel in the target image. The trick is to figure out which coefficients to use.
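As a rough illustration of that optimization (again one-dimensional, plain Python), the sketch below never builds the W3-sized intermediate. For each output sample it works out which original pixels would have landed under the filter and which tap each one meets. The function name `resample_direct` and its arguments are hypothetical; `taps` is assumed to be an odd-length low-pass kernel designed on the W3 grid, and pixels beyond the border are treated as zero.

```python
import math

def resample_direct(row, w2, taps):
    """Produce w2 output samples directly from `row`, giving the same result
    as zero-stuffing to W3, convolving with `taps`, and decimating, but
    skipping every multiplication by an inserted zero and every filtered
    sample that decimation would throw away."""
    w1 = len(row)
    w3 = w1 * w2 // math.gcd(w1, w2)      # least common multiple of W1 and W2
    up, down = w3 // w1, w3 // w2
    mid = len(taps) // 2

    out = []
    for j in range(w2):
        p = j * down                       # where output j sits on the W3 grid
        # Original pixel i sits at i * up on that grid and meets tap index
        # p + mid - i * up. Keep only the i for which that index is valid.
        i_min = max(0, -(-(p - mid) // up))        # ceil((p - mid) / up)
        i_max = min(w1 - 1, (p + mid) // up)
        acc = 0.0
        for i in range(i_min, i_max + 1):
            acc += taps[p + mid - i * up] * row[i]
        out.append(acc * up)               # same gain compensation as zero-stuffing
    return out
```

With a kernel of N taps, each output sample costs roughly N / up multiply-adds instead of N, and only W2 outputs are computed instead of W3. Real scalers typically go one step further and precompute the small set of distinct coefficient groups, since the tap indices repeat with a period tied to `up`.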

libswscale in the ffmpeg project does something like this, I believe. Check it out:

http://gitorious.org/libswscale

As others pointed out (and as you apparently noticed), decimating the image without adequate low-pass filtering introduces aliasing artifacts. I can't be sure about your resampling implementation, but the technique has interesting gotchas depending on the window size you use and other implementation details.
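If you want to see the aliasing in isolation, a toy one-dimensional example is enough. The numbers below are made up for illustration: a cosine at 0.4 cycles per sample is above the new Nyquist limit (0.25 cycles per original sample) once you halve the sample rate, so dropping every other sample turns it into a full-strength cosine at 0.2 cycles per new sample. Even a crude two-sample average before decimating (a very poor low-pass filter, used here only because it is short) knocks that alias down considerably.

```python
import math

def drop_every_other(samples):
    """Naive decimation by 2: keep even-indexed samples, no filtering."""
    return samples[::2]

def average_pairs(samples):
    """Decimation by 2 after a two-tap box filter: average each adjacent pair."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples) - 1, 2)]

# A cosine at 0.4 cycles/sample, above the post-decimation Nyquist limit.
signal = [math.cos(2 * math.pi * 0.4 * n) for n in range(64)]

naive = drop_every_other(signal)

# The naive result is indistinguishable from a 0.2 cycles/sample cosine
# that was never in the original signal.
alias = [math.cos(2 * math.pi * 0.2 * m) for m in range(len(naive))]
worst_mismatch = max(abs(a - b) for a, b in zip(naive, alias))

# Compare how strongly the spurious component survives in each case.
peak_naive = max(abs(v) for v in naive)
peak_averaged = max(abs(v) for v in average_pairs(signal))

print(f"mismatch vs. alias: {worst_mismatch:.2e}")
print(f"alias peak, naive: {peak_naive:.3f}, averaged first: {peak_averaged:.3f}")
```

In an image the same thing shows up as moire patterns and jagged edges, and a better-designed filter than the two-tap average suppresses it much further.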