Determining frequency of an array in Python

y33t picture y33t · Dec 20, 2011 · Viewed 16k times · Source

I have a sample file filled with floating point numbers as follows:

    -0.02  3.04  3.04  3.02  3.02  3.06  3.04  3.02  3.04  3.02  3.04  3.02
     3.04  3.02  3.04  3.04  3.04  3.02  3.04  3.02  3.04  3.02  3.04  3.02
     3.06  3.02  3.04  3.02  3.04  3.02  3.02  3.06  3.04  3.02  3.04  3.02
     3.04  3.02  3.04  3.04  3.04  3.02  3.04  3.02  3.02  3.06  3.04  3.02
     3.06  3.02  3.04 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.04 -0.02 -0.04

These numbers are placed in a text file. I am trying to read the text file and determine the frequency of this signal. This data is captured from a digital oscilloscope. I can see the frequency in the scope display but I also want to validate it by processing it in Python. I capture data from the device with Python on the PC side.

Even though I can do some low-level stuff in Python, I am a total newbie to text processing. I suppose I need to first load the data in file to an array and then perform an FFT or a simpler algorithm that will yield to an integer in Hz.

In theory I know how to perform a Fourier analysis and I can do it in paper for any given signal. I have no idea where to start in Python for a given dataset. I already tried documentation of scipy-numpy but didn't work that well for me.

I would appreciate guidance from experienced users.

Answer

David Z picture David Z · Dec 20, 2011

It's not clear from your question exactly what the values in the file represent. But assuming that they indicate consecutive voltage samples, you can load the file into a Numpy array using

import numpy as np
data = np.array([float(f) for f in file(filename).read().split()])

and then compute the Fourier transform as

import numpy.fft as fft
spectrum = fft.fft(data)

You can then plot the magnitudes of the FFT as

freq = fft.fftfreq(len(spectrum))
plot(freq, abs(spectrum))

and what you see should match up with what displays on the oscilloscope.

If you want to identify the dominant frequencies in the spectrum, you'll have to cut the array at some threshold, e.g. something like this:

threshold = 0.5 * max(abs(spectrum))
mask = abs(spectrum) > threshold
peaks = freq[mask]

The contents of freq (and thus also peaks) are the frequencies in units of the sampling rate. For example, if your oscilloscope samples the waveform every microsecond, the values in freq are in megahertz. So if you feed in an ideal 1 kHz signal, which you can do with e.g.

t = arange(4e6) / 1e6 # sampling times in seconds
data = sin(2 * pi * 1000 * t)

you would get a peak at 0.001 MHz, and accordingly you will find that peaks = array([-0.001, 0.001]).