Can't find the right energy using scipy.signal.welch

gravedigger picture gravedigger · Apr 3, 2015 · Viewed 8.2k times · Source

For a given discret time signal x(t) with spacing dt (which is equal to 1/fs, fs being the sample rate), the energy is:

E[x(t)] = sum(abs(x)**2.0)/fs

Then I do a DFT of x(t):

x_tf = np.fft.fftshift( np.fft.fft( x ) ) / ( fs * ( 2.0 * np.pi ) ** 0.5 )

and compute the energy again:

E[x_tf] = sum( abs( x_tf ) ** 2.0 ) * fs * 2 * np.pi / N

(here the factor fs*2*np.pi/N = pulsation spacing dk, the documentation of fftfreq gives more details about spacing in frequency domain), I have the same energy:

E[x(t)] = E[x_tf]

BUT... when I compute the power spectral density of x(t) using scipy.signal.welch, I can't find the right energy. scipy.signal.welch returns the vector of frequencies f and energy Pxx (or energy per frequency, depending on which scaling we enter in arguments of scipy.signal.welch).

How can I find the same energy as E[x(t)] or E[x_tf] using Pxx? I tried to compute:

E_psd = sum(Pxx_den) / nperseg

where nperseg being the length of each segment of Welch algorithm, factors like fs and np.sqrt(2*np.pi) being cancelled out, and rescale E[x(t)] with nperseg, but without any success (orders of magnitude smaller than E[x(t)] )

I used the following code to generate my signal:

#Generate a test signal, a 2 Vrms sine wave at 1234 Hz, corrupted by 0.001 V**2/Hz of white noise sampled at 10 kHz.

fs = 10e3   #sampling rate, dt = 1/fs
N = 1e5
amp = 2*np.sqrt(2)
freq = 1234.0
noise_power = 0.001 * fs / 2
time = np.arange(N) / fs
x = amp*np.sin(2*np.pi*freq*time)
x += np.random.normal(scale=np.sqrt(noise_power), size=time.shape)

and I did the following to get the power spectral density:

f, Pxx_den = signal.welch(x, fs )

Answer

E. Davis picture E. Davis · Oct 21, 2015

The resolution to this apparent discrepancy lies in a careful understanding and application of

  • continuous vs. discrete Fourier transforms, and
  • energy, power, and power spectral density of a given signal

I too have struggled with this exact question, so I will try to be as explicit as possible in the discussion below.

Discrete Fourier transform (DFT)

A continuous signal x(t) satisfying certain integrability conditions has a Fourier transform X(f). When working with a discrete signal x[n], however, it is often conventional to work with the discrete-time Fourier transform (DTFT). I will denote the DTFT as X_{dt}(f), where dt equals the time interval between adjacent samples. The key to answering your question requires that you recognize that the DTFT is not equal to the corresponding Fourier transform! In fact, the two are related as

X_{dt}(f) = (1 / dt) * X(f)

Further, the discrete Fourier transform (DFT) is simply a discrete sample of the DTFT. The DFT, of course, is what Python returns when using np.fft.fft(...). Thus, your computed DFT is not equal to the Fourier transform!

Power spectral density

scipy.signal.welch(..., scaling='density', ...) returns an estimate of the power spectral density (PSD) of discrete signal x[n]. A full discussion of the PSD is a bit beyond the scope of this post, but for a simple periodic signal (such as that in your example), the PSD S_{xx}(f) is given as

S_{xx} = |X(f)|^2 / T

where |X(f)| is the Fourier transform of the signal and T is the total duration (in time) of the signal (if your signal x(t) were instead a random process, we'd have to take an ensemble average over many realizations of the system...). The total power in the signal is simply the integral of S_{xx} over the system's frequency bandwidth. Using your code above, we can write

import scipy.signal

# Estimate PSD `S_xx_welch` at discrete frequencies `f_welch`
f_welch, S_xx_welch = scipy.signal.welch(x, fs=fs)

# Integrate PSD over spectral bandwidth
# to obtain signal power `P_welch`
df_welch = f_welch[1] - f_welch[0]
P_welch = np.sum(S_xx_welch) * df_welch

To make contact with your np.fft.fft(...) computations (which return the DFT), we must use the information from the previous section, namely that

X[k] = X_{dt}(f_k) = (1 / dt) * X(f_k)

Thus, to compute the power spectral density (or total power) from the FFT computations, we need to recognize that

S_{xx} = |X[k]|^2 * (dt ^ 2) / T

# Compute DFT
Xk = np.fft.fft(x)

# Compute corresponding frequencies
dt = time[1] - time[0]
f_fft = np.fft.fftfreq(len(x), d=dt)

# Estimate PSD `S_xx_fft` at discrete frequencies `f_fft`
T = time[-1] - time[0]
S_xx_fft = ((np.abs(Xk) * dt) ** 2) / T

# Integrate PSD over spectral bandwidth to obtain signal power `P_fft`
df_fft = f_fft[1] - f_fft[0]
P_fft = np.sum(S_xx_fft) * df_fft

Your values for P_welch and P_fft should be very close to each other, as well as close to the expected power in the signal, which can be computed as

# Power in sinusoidal signal is simply squared RMS, and
# the RMS of a sinusoid is the amplitude divided by sqrt(2).
# Thus, the sinusoidal contribution to expected power is
P_exp = (amp / np.sqrt(2)) ** 2 

# For white noise, as is considered in this example,
# the noise is simply the noise PSD (a constant)
# times the system bandwidth. This was already
# computed in the problem statement and is given
# as `noise_power`. Simply add to `P_exp` to get
# total expected signal power.
P_exp += noise_power

Note: P_welch and P_fft will not be exactly equal, and likely not even equal within numerical accuracy. This is attributable to the fact that there are random errors associated with the estimation of the power spectral density. In an effort to reduce such errors, Welch's method splits your signal into several segments (the size of which is controlled by the nperseg keyword), computes the PSD of each segment, and averages the PSDs to obtain a better estimate of the signal's PSD (the more segments averaged over, the less the resulting random error). The FFT method, in effect, is equivalent to only computing and averaging over one, large segment. Thus, we expect some differences between P_welch and P_fft, but we should expect that P_welch is more accurate.

Signal Energy

As you stated, the signal energy can be obtained from the discrete version of Parseval's theorem

# Energy obtained via "integrating" over time
E = np.sum(x ** 2)

# Energy obtained via "integrating" DFT components over frequency.
# The fact that `E` = `E_fft` is the statement of 
# the discrete version of Parseval's theorem.
N = len(x)
E_fft = np.sum(np.abs(Xk) ** 2) / N

We'd like to now understand how S_xx_welch, computed above via scipy.signal.welch(...), relates to the total energy E in the signal. From above, S_xx_fft = ((np.abs(Xk) * dt) ** 2) / T. Rearranging the terms in this expression, we see that np.abs(Xk) ** 2 = (T / (dt ** 2)) * S_xx_fft. Further,

From above, we know that np.sum(S_xx_fft) = P_fft / df_fft and that P_fft and P_welch are approximately equal. Further, P_welch = np.sum(S_xx_welch) / df_welch so that we obtain

np.sum(S_xx_fft) = (df_welch / df_fft) * np.sum(S_xx_welch)

Further, S_xx_fft = ((np.abs(Xk) * dt) ** 2) / T. Substituting S_xx_fft into the equation above and rearranging terms, we arrive at

np.sum(np.abs(Xk) ** 2) = (T / (dt ** 2)) * (df_welch / df_fft) * np.sum(S_xx_welch)

The left-hand side (LHS) in the above equation should now look very close to the expression for the total energy in the signal as computed from the DFT components. Now, note that T / dt = N, where N is the number of sample points in your signal. Dividing through by N, we now have a LHS that is, by definition, equal to the E_fft computed above. Thus, we can obtain the total energy in the signal from Welch's PSD via

# Signal energy from Welch's PSD
E_welch = (1. / dt) * (df_welch / df_fft) * np.sum(S_xx_welch)

E, E_fft, and E_welch should all be very close in value :) As discussed at the end of the preceding section, we do expect some slight differences between E_welch compared to E and E_fft, but this is attributable to the fact that values derived from Welch's method have reduced random error (i.e. are more accurate).