I have been looking through this fantastic article: http://blogs.zynaptiq.com/bernsee/pitch-shifting-using-the-ft/
While being fantastic, it is extremely hard and heavy going. This material is really stretching me.
I have extracted the maths from Stefan's code module that calculates the exact frequency for a given bin. But I don't understand the last calculation. Can someone explain to me the mathematical construction at the end?
Before digging into the code, let me set the scene:
Let's say we set fftFrameSize = 1024, so we are dealing with 512+1 bins
As an example, Bin[1]'s ideal frequency fits a single wave in the frame. At a sample rate of 40KHz, tOneFrame = 1024/40K seconds = 1/40s, so Bin[1] would ideally be collecting a 40Hz signal.
Setting osamp (overSample) = 4, we progress along our input signal in steps of 256. So the first analysis examines bytes zero through 1023, then 256 through 1279, etc. Note each float gets processed 4 times.
...
void calcBins(
long fftFrameSize,
long osamp,
float sampleRate,
float * floats,
BIN * bins
)
{
/* initialize our static arrays */
static float gFFTworksp[2*MAX_FRAME_LENGTH];
static float gLastPhase[MAX_FRAME_LENGTH/2+1];
static long gInit = 0;
if (! gInit)
{
memset(gFFTworksp, 0, 2*MAX_FRAME_LENGTH*sizeof(float));
memset(gLastPhase, 0, (MAX_FRAME_LENGTH/2+1)*sizeof(float));
gInit = 1;
}
/* do windowing and re,im interleave */
for (long k = 0; k < fftFrameSize; k++)
{
double window = -.5*cos(2.*M_PI*(double)k/(double)fftFrameSize)+.5;
gFFTworksp[2*k] = floats[k] * window;
printf("sinValue: %f", gFFTworksp[2*k]);
gFFTworksp[2*k+1] = 0.;
}
/* do transform */
smbFft(gFFTworksp, fftFrameSize, -1);
printf("\n");
/* this is the analysis step */
for (long k = 0; k <= fftFrameSize/2; k++)
{
/* de-interlace FFT buffer */
double real = gFFTworksp[2*k];
double imag = gFFTworksp[2*k+1];
/* compute magnitude and phase */
double magn = 2.*sqrt(real*real + imag*imag);
double phase = atan2(imag,real);
/* compute phase difference */
double phaseDiff = phase - gLastPhase[k];
gLastPhase[k] = phase;
/* subtract expected phase difference */
double binPhaseOffset = M_TWOPI * (double)k / (double)osamp;
double deltaPhase = phaseDiff - binPhaseOffset;
/* map delta phase into [-Pi, Pi) interval */
// better, but obfuscatory...
// deltaPhase -= M_TWOPI * floor(deltaPhase / M_TWOPI + .5);
while (deltaPhase >= M_PI)
deltaPhase -= M_TWOPI;
while (deltaPhase < -M_PI)
deltaPhase += M_TWOPI;
(EDIT:) Now the bit I don't get:
// Get deviation from bin frequency from the +/- Pi interval
// Compute the k-th partials' true frequency
// Start with bin's ideal frequency
double bin0Freq = (double)sampleRate / (double)fftFrameSize;
bins[k].idealFreq = (double)k * bin0Freq;
// Add deltaFreq
double sampleTime = 1. / (double)sampleRate;
double samplesInStep = (double)fftFrameSize / (double)osamp;
double stepTime = sampleTime * samplesInStep;
double deltaTime = stepTime;
// Definition of frequency is rate of change of phase, i.e. f = dϕ/dt
// double deltaPhaseUnit = deltaPhase / M_TWOPI; // range [-.5, .5)
double freqAdjust = (1. / M_TWOPI) * deltaPhase / deltaTime;
// Actual freq <-- WHY ???
bins[k].freq = bins[k].idealFreq + freqAdjust;
}
}
I just can't see it clearly, even though it seems to be staring in the face. Could someone please explain this process from scratch, step by step?
The basic principle is very simple. If a given component exactly matches a bin frequency then its phase will not change from one FT to the next. However if the frequency does not correspond exactly with the bin frequency then there will be a phase change between successive FTs. The frequency delta is just:
delta_freq = delta_phase / delta_time
and the refined estimate of the frequency of the component will then be:
freq_est = bin_freq + delta_freq