I've been searching Google for a while now to find pseudo-code of a decently efficient 8x8 (or nxn) DCT algorithm, and I can't find anything!
I implemented the naive approach, and it took far too long to execute.
If you could post some pseudo-code or reference a good book/document/website, that would be helpful.
C or C++ examples would be better yet!
As requested in the comments, source (be slightly warned, it's in C#, but the difference with C++ should be minimal, and yes I know the code is lame):
Main loop (A = result, B = input):
for (int y = 0; y < 8; y++)
{
for (int x = 0; x < 8; x++)
{
A[y * 8 + x] = 0;
for (int u = 0; u < 8; u++)
for (int v = 0; v < 8; v++)
A[y * 8 + x] += alpha(u) * alpha(v) * B[u, v] *
cosine[u, x] * cosine[v, y];
}
}
Support stuff:
static double alpha(int i)
{
if (i == 0)
return SQRT2o2 * 0.5;
return 0.5;
}
const double SQRT2o2 = 1.414213562373095048801688724209 * 0.5;
cosine = new double[8, 8];
const double inv16 = 1.0 / 16.0;
for (int i = 0; i < 8; i++)
{
for (int j = 0; j < 8; j++)
{
cosine[j, i] = Math.Cos(Math.PI * j * (2.0 * i + 1) * inv16);
}
}
edit: I timed it - for 512 by 512 pixels (single channel) it takes half a second. Sure that's slow, but nowhere near "forever".