I've been reading a lot about the Fast Fourier Transform and am trying to understand its low-level details. Unfortunately, Google and Wikipedia are not helping much, and I have about five different algorithm books open that aren't helping much either.
I'm trying to find the FFT of something simple like a vector [1,0,0,0]. Sure I could just plug it into Matlab but that won't help me understand what's going on underneath. Also, when I say I want to find the FFT of a vector, is that the same as saying I want to find the DFT of a vector just with a more efficient algorithm?
You're right, "the" Fast Fourier transform is just a name for any algorithm that computes the discrete Fourier transform in O(n log n) time, and there are several such algorithms.
Here's the simplest explanation of the DFT and FFT as I think of them, and also examples for small N, which may help. (Note that there are alternative interpretations, and other algorithms.)
Given N numbers f_0, f_1, f_2, …, f_{N-1}, the DFT gives a different set of N numbers.
Specifically: Let ω be a primitive Nth root of 1 (either in the complex numbers or in some finite field), which means that ω^N = 1 but no smaller positive power is 1. You can think of the f_k's as the coefficients of a polynomial P(x) = ∑ f_k x^k. The N new numbers F_0, F_1, …, F_{N-1} that the DFT gives are the results of evaluating the polynomial at powers of ω. That is, for each n from 0 to N-1, the new number F_n is P(ω^n) = ∑_{0≤k≤N-1} f_k ω^{nk}.
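To make the definition concrete, here is a minimal Python sketch that evaluates the polynomial directly at each power of ω. The function name `dft` and the default ω = e^{2πi/N} are my own choices for illustration; the conjugate root e^{-2πi/N}, which many libraries use, is an equally valid primitive root.

```python
import cmath

def dft(f, omega=None):
    """Evaluate P(x) = sum of f[k] * x**k at x = omega**0, ..., omega**(N-1)."""
    n_terms = len(f)
    if omega is None:
        # Assumption: one common complex choice of primitive Nth root of 1.
        omega = cmath.exp(2j * cmath.pi / n_terms)
    return [
        sum(f[k] * omega ** (n * k) for k in range(n_terms))
        for n in range(n_terms)
    ]

# The vector from the question; every output entry should equal 1.
print(dft([1, 0, 0, 0]))
```

Each of the N outputs is a sum of N terms, which is where the quadratic cost of the direct approach comes from.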
[The reason for choosing ω is that the inverse DFT has a nice form, very similar to the DFT itself.]
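As a rough illustration of that remark: with the unnormalized definition used here, the inverse evaluates at powers of ω^(-1) instead of ω and then divides by N. The sketch below works in the finite field of integers mod 17 with ω = 4 (a primitive 4th root of 1 there, since 4^2 = 16 = -1), so the round trip is exact. The names `dft_mod` and `idft_mod`, the modulus, and the sample vector are assumptions chosen for the example.

```python
def dft_mod(f, omega, p):
    """DFT over the integers mod p: F[i] = sum of f[k] * omega**(i*k) mod p."""
    n = len(f)
    return [
        sum(f[k] * pow(omega, i * k, p) for k in range(n)) % p
        for i in range(n)
    ]

def idft_mod(F, omega, p):
    """Inverse DFT: same sum with omega replaced by its inverse, then divide by N."""
    n = len(F)
    omega_inv = pow(omega, -1, p)  # modular inverse (needs Python 3.8+)
    n_inv = pow(n, -1, p)
    return [(n_inv * x) % p for x in dft_mod(F, omega_inv, p)]

P_MOD, OMEGA = 17, 4        # hypothetical small field and root for illustration
f = [3, 1, 4, 1]            # arbitrary sample input
F = dft_mod(f, OMEGA, P_MOD)
assert idft_mod(F, OMEGA, P_MOD) == f  # the round trip recovers the input
```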
Note that finding these F's naively takes O(N^2) operations. But we can exploit the special structure that comes from the ω's we chose, and that allows us to do it in O(N log N). Any such algorithm is called the fast Fourier transform.
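So here's one way of doing the FFT. I'll replace N with 2N to simplify notation, so ω is now a primitive 2Nth root of 1. We have f_0, f_1, f_2, …, f_{2N-1}, and we want to compute P(ω^0), P(ω^1), …, P(ω^{2N-1}), where we can write
P(x) = Q(x) + x^N R(x) with
Q(x) = f_0 + f_1 x + … + f_{N-1} x^{N-1}
R(x) = f_N + f_{N+1} x + … + f_{2N-1} x^{N-1}
Now here's the beauty of the thing. Observe that ω^N = -1 (its square is 1, but it is not 1 itself), so the factor x^N is just ±1 at every power of ω: at x = ω^k it equals (-1)^k. At the even powers this gives
P(ω^{2j}) = Q(ω^{2j}) + R(ω^{2j}),
and at the odd powers
P(ω^{2j+1}) = Q(ω^{2j+1}) - R(ω^{2j+1}) = ∑_{0≤k≤N-1} (f_k - f_{N+k}) ω^k (ω^2)^{jk}.
Here ω^2 is a primitive Nth root of 1. So the even-indexed values are the evaluations of the N-term polynomial Q + R at the N powers of ω^2, and the odd-indexed values are the evaluations, at the same N points, of the N-term polynomial whose kth coefficient is (f_k - f_{N+k}) ω^k.
This means that your original problem, evaluating the 2N-term polynomial P at the 2N points ω^0 to ω^{2N-1}, has been reduced to two problems of evaluating an N-term polynomial at the N powers of ω^2, plus O(N) work to form the two new coefficient lists. So the running time satisfies T(2N) = 2T(N) + O(N), which gives T(N) = O(N log N).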
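Here is a sketch of one recursive scheme of this kind in Python, for lengths that are a power of 2. It splits the coefficients into a first half Q and a second half R and uses the fact that ω^(n/2) = -1: the even-indexed outputs come from the polynomial Q + R, and the odd-indexed outputs from Q - R with its kth coefficient scaled by ω^k, both evaluated at powers of ω^2. The function name `fft` and the default root e^{2πi/n} are assumptions for illustration, and this is not tuned for speed or numerical accuracy.

```python
import cmath

def fft(f, omega=None):
    """Values of P(x) = sum f[k] x**k at omega**0, ..., omega**(n-1).

    Assumes len(f) is a power of 2 and omega is a primitive nth root of 1.
    """
    n = len(f)
    if n == 1:
        return list(f)  # P is constant, and the only point is omega**0 = 1
    if n % 2:
        raise ValueError("length must be a power of 2")
    if omega is None:
        omega = cmath.exp(2j * cmath.pi / n)  # assumed default root
    half = n // 2
    q, r = f[:half], f[half:]  # P(x) = Q(x) + x**half * R(x)
    # At even powers of omega, x**half = 1, so P agrees with Q + R.
    even_poly = [q[k] + r[k] for k in range(half)]
    # At odd powers, x**half = -1, so P agrees with Q - R; writing
    # x = omega * y scales the kth coefficient by omega**k.
    odd_poly = [(q[k] - r[k]) * omega ** k for k in range(half)]
    omega_sq = omega * omega  # primitive (n/2)th root of 1
    out = [0] * n
    out[0::2] = fft(even_poly, omega_sq)
    out[1::2] = fft(odd_poly, omega_sq)
    return out

print(fft([1, 2, 3, 4], 1j))
```

Each call does O(n) work to build the two half-length lists and then recurses twice, which matches the recurrence T(2N) = 2T(N) + O(N).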
Note that other definitions put a factor of 1/N or 1/√N in front of the sum.
For N=2, ω=-1, and the Fourier transform of (a,b) is (a+b, a-b).
For N=3, ω is a complex cube root of 1, and the Fourier transform of (a,b,c) is (a+b+c, a+bω+cω^2, a+bω^2+cω). (Since ω^4 = ω.)
For N=4, ω=i, and the Fourier transform of (a,b,c,d) is (a+b+c+d, a+bi-c-di, a-b+c-d, a-bi-c+di). In particular, for the example in your question: the DFT of (1,0,0,0) is (1,1,1,1), not very illuminating perhaps.
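If you want to check these small cases numerically, a quick Python sketch along the following lines should do. The helper names `dft` and `close` and the sample values for a, b, c, d are arbitrary choices of mine; the comparison uses a small tolerance because the complex roots are floating-point approximations.

```python
import cmath

def dft(f, omega):
    """Direct evaluation of P(x) = sum f[k] x**k at the powers of omega."""
    n = len(f)
    return [sum(f[k] * omega ** (i * k) for k in range(n)) for i in range(n)]

def close(xs, ys, tol=1e-9):
    """True if the two lists agree entry by entry up to tol."""
    return len(xs) == len(ys) and all(abs(x - y) < tol for x, y in zip(xs, ys))

a, b, c, d = 2, 3, 5, 7  # arbitrary sample values

# N = 2, omega = -1
assert close(dft([a, b], -1), [a + b, a - b])

# N = 3, omega = a primitive cube root of 1
w = cmath.exp(2j * cmath.pi / 3)
assert close(dft([a, b, c], w),
             [a + b + c, a + b * w + c * w**2, a + b * w**2 + c * w])

# N = 4, omega = i
assert close(dft([a, b, c, d], 1j),
             [a + b + c + d, a + b * 1j - c - d * 1j,
              a - b + c - d, a - b * 1j - c + d * 1j])

# The vector from the question: every entry should equal 1.
print(dft([1, 0, 0, 0], 1j))
```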