I am new to FFTs and signal processing, so hopefully this question makes sense and/or isn't stupid.
I would like to perform spectrum analysis on a live audio signal. My goal is to find a good tradeoff between responsiveness and frequency resolution, such that I can take a guess at the pitch of the incoming audio in near-realtime.
From what I've gathered about the math behind the Fourier transform, there is an inherent tradeoff between window length (the number of samples fed to each FFT) and frequency resolution. The longer the window, the better the resolution. Since I am trying to minimize the window length (to meet the near-realtime requirement), my resolution suffers (each bin in the output buffer corresponds to a wide frequency range, which is undesirable).
However, for my intended application, I don't care about most of the spectrum. I only need spectrum info for a narrow frequency range, say 100 Hz to 1600 Hz. Is there any way to modify an FFT implementation such that I can improve the resolution of the frequency domain output while keeping the input buffer size constant (and small)? In other words, can I trade total output bandwidth for output resolution? If so, how is this done?
Although I have a weak grasp of the math at best, it seems that padding the input buffer with zeros might be interesting, no?
Thanks in advance for any help you can offer.
You can't get additional information from nowhere, but you can reduce latency by overlapping successive FFTs. For real-time power spectrum estimates it's common to overlap successive input windows by 50%.
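As a rough illustration of the overlap idea, here is a minimal Python sketch that slices a buffer into frames sharing 50% of their samples. The function name `overlapping_frames` and the frame size of 8 are made up for the example; a real-time version would pull samples from an audio callback instead of a list.

```python
def overlapping_frames(samples, frame_size, overlap=0.5):
    """Yield successive frames of frame_size samples.

    `overlap` is the fraction (0 <= overlap < 1) of each frame that is
    shared with the previous frame. Trailing samples that do not fill a
    whole frame are dropped.
    """
    # Number of new samples needed before the next frame can be produced.
    hop = max(1, int(frame_size * (1.0 - overlap)))
    for start in range(0, len(samples) - frame_size + 1, hop):
        yield samples[start:start + frame_size]


# Hypothetical input: 16 samples, frames of 8, 50% overlap (hop of 4).
signal = list(range(16))
frames = list(overlapping_frames(signal, frame_size=8, overlap=0.5))
```

With 50% overlap a new spectrum is available every `frame_size / 2` samples rather than every `frame_size` samples, while each individual FFT keeps the same length and therefore the same resolution. In practice each frame is usually multiplied by a window function (Hann, for example) before it is transformed.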
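Padding the input buffer with zeros, as you suggest, is another useful trick: appending zeros to the end of the buffer before the FFT gives you more apparent resolution in the output bins, but in reality all you are doing is interpolating the spectrum, i.e. there is no additional information gained (of course). Note that the zeros go at the end of the buffer, not between the samples; inserting zeros between samples would instead produce repeated images of the spectrum. You might find this technique useful though, in addition to the overlap suggestion above.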
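To see what zero-padding (zeros appended to the end of the buffer) does to the bin spacing, here is a small Python sketch. It uses a naive O(N^2) DFT from the standard library purely to stay self-contained; a real application would use an FFT library. The 8000 Hz sample rate, the 64-sample frame, the 440 Hz test tone and the function names are all assumptions chosen for the example.

```python
import cmath
import math


def dft_magnitudes(frame, n_fft):
    """Magnitudes of bins 0..n_fft/2 after zero-padding `frame` to n_fft points."""
    padded = list(frame) + [0.0] * (n_fft - len(frame))  # zeros appended at the end
    mags = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(padded))
        mags.append(abs(acc))
    return mags


def peak_frequency(frame, sample_rate, n_fft):
    """Frequency in Hz of the largest-magnitude bin."""
    mags = dft_magnitudes(frame, n_fft)
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / n_fft  # bin spacing is sample_rate / n_fft


SAMPLE_RATE = 8000   # assumed sample rate in Hz
N = 64               # assumed (small) input buffer size
TONE_HZ = 440.0      # assumed test tone

frame = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(N)]

coarse = peak_frequency(frame, SAMPLE_RATE, N)      # bins 125 Hz apart
fine = peak_frequency(frame, SAMPLE_RATE, 8 * N)    # bins 15.625 Hz apart
```

The padded transform places its peak closer to the true tone because the same underlying spectrum is sampled on a finer grid. It will not separate two tones that the 64-sample window already blurs together, which is the "no additional information" caveat above.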