Euclidean distance vs Pearson correlation vs cosine similarity?

TIMEX · Dec 3, 2009 · Viewed 34.4k times

Their goals are all the same: to find similar vectors. Which do you use in which situation? Are there any practical examples?

Answer

dsimcha · Dec 3, 2009

Pearson correlation and cosine similarity are invariant to scaling, i.e. multiplying all elements by a positive constant (a negative constant flips the sign of the result). Pearson correlation is also invariant to adding any constant to all elements. For example, if you have two vectors X1 and X2, and your Pearson correlation function is called pearson(), then pearson(X1, X2) == pearson(X1, 2 * X2 + 3). Euclidean distance, by contrast, changes under both operations. This is a pretty important property because you often don't care whether two vectors are similar in absolute terms, only whether they vary in the same way.
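To make the invariance claims concrete, here is a minimal pure-Python sketch. The pearson() name comes from the example above; euclidean() and cosine(), along with the sample vectors, are made up for illustration and are not from any particular library. It relies on the fact that Pearson correlation is the cosine similarity of the mean-centered vectors.

```python
import math

def euclidean(x, y):
    # Straight-line distance between the two vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine(x, y):
    # Cosine of the angle between the two vectors.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    # Pearson correlation is cosine similarity after subtracting each mean.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Hypothetical sample data.
X1 = [1.0, 2.0, 4.0, 7.0]
X2 = [2.0, 3.0, 3.5, 9.0]
scaled = [2 * v for v in X2]       # 2 * X2
shifted = [2 * v + 3 for v in X2]  # 2 * X2 + 3

# Pearson ignores both the scale and the offset.
print(math.isclose(pearson(X1, X2), pearson(X1, shifted)))  # True
# Cosine ignores the scale but not the offset.
print(math.isclose(cosine(X1, X2), cosine(X1, scaled)))     # True
print(math.isclose(cosine(X1, X2), cosine(X1, shifted)))    # False
# Euclidean distance is affected by scaling alone.
print(math.isclose(euclidean(X1, X2), euclidean(X1, scaled)))  # False
```

The comparisons use math.isclose rather than == because floating-point rounding can make mathematically equal results differ in the last few bits.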