Difference between a mel-spectrogram and an MFCC

monadoboi · Dec 25, 2018 · Viewed 12k times

I'm using the librosa library to convert music segments into mel-spectrograms to use as inputs for my neural network, as shown in the librosa docs.

How is this different from MFCCs, if at all? Are there any advantages or disadvantages to using either?

Answer

Jon Nordby · Jan 23, 2019

To get MFCCs, take the discrete cosine transform (DCT) of the mel-spectrogram along the frequency (mel band) axis. The mel-spectrogram is usually log-scaled first.
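
As a rough illustration of that relationship, here is a minimal NumPy sketch. It assumes you already have a mel power spectrogram of shape (n_mels, n_frames), for example from `librosa.feature.melspectrogram`; random data stands in for it here. The helper names `dct_ii_matrix` and `mfcc_from_mel` are made up for this example, and the scaling (10·log10, orthonormal DCT-II) is one common convention, not necessarily what every library does by default. In practice `librosa.feature.mfcc` does this for you.

```python
import numpy as np

def dct_ii_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis for length-n_in input."""
    k = np.arange(n_out)[:, None]   # coefficient index
    n = np.arange(n_in)[None, :]    # mel band index
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] *= np.sqrt(0.5)        # extra scaling for the k = 0 row
    return basis

def mfcc_from_mel(mel_power, n_mfcc=13, eps=1e-10):
    """Log-scale a mel power spectrogram, then DCT along the mel axis."""
    log_mel = 10.0 * np.log10(np.maximum(mel_power, eps))  # power -> dB
    return dct_ii_matrix(n_mfcc, mel_power.shape[0]) @ log_mel

# Stand-in for a real mel spectrogram: 64 mel bands, 100 frames (assumption).
rng = np.random.default_rng(0)
mel_power = rng.random((64, 100)) + 1e-3

mfcc = mfcc_from_mel(mel_power, n_mfcc=13)
print(mel_power.shape, "->", mfcc.shape)
```

So the MFCC matrix has the same number of frames as the mel-spectrogram, but each frame is described by 13 cosine coefficients instead of 64 band energies.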

MFCCs are a compact representation, often using just 13 or 20 coefficients instead of the 32-64 bands of a mel-spectrogram. The coefficients are also somewhat more decorrelated than the mel bands, which can be beneficial for simpler models such as Gaussian Mixture Models, which are often fitted with diagonal covariance matrices. With lots of data and strong classifiers like Convolutional Neural Networks, the mel-spectrogram often performs better.
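
To make the decorrelation point concrete, here is a small synthetic sketch rather than a result on real audio. It assumes that neighbouring mel bands are strongly correlated and models that with a simple AR(1) process across bands (`rho = 0.9`, chosen arbitrarily); `dct_ii_matrix` and `mean_abs_offdiag_corr` are hypothetical helpers for this example only.

```python
import numpy as np

def dct_ii_matrix(n):
    """Square orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n)) * np.sqrt(2.0 / n)
    basis[0] *= np.sqrt(0.5)
    return basis

def mean_abs_offdiag_corr(features):
    """Mean absolute correlation between different rows (feature dimensions)."""
    corr = np.corrcoef(features)                 # rows are treated as variables
    mask = ~np.eye(corr.shape[0], dtype=bool)    # ignore the diagonal
    return np.abs(corr[mask]).mean()

rng = np.random.default_rng(0)
n_bands, n_frames, rho = 32, 5000, 0.9

# Synthetic "log-mel" frames: each band is correlated with its neighbour.
noise = rng.standard_normal((n_bands, n_frames))
log_mel = np.empty_like(noise)
log_mel[0] = noise[0]
for b in range(1, n_bands):
    log_mel[b] = rho * log_mel[b - 1] + np.sqrt(1 - rho**2) * noise[b]

# DCT along the band axis, keeping the first 13 coefficients.
mfcc = (dct_ii_matrix(n_bands) @ log_mel)[:13]

print("features per frame:", log_mel.shape[0], "->", mfcc.shape[0])
print("mean |corr| between mel bands:  ", mean_abs_offdiag_corr(log_mel))
print("mean |corr| between MFCC coeffs:", mean_abs_offdiag_corr(mfcc))
```

On data like this the coefficients come out much less correlated with each other than the bands are, which is what makes them a good fit for models that treat feature dimensions as independent. A CNN, by contrast, can exploit the local time-frequency structure that the mel-spectrogram keeps and the DCT mixes together.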