variance vs coefficient of variation

Bayes · Jan 15, 2016 · Viewed 7.2k times

I need to identify which statistic lets me find which line of a digital image has the highest variation. I am using the variance (squared units, calculated as numpy.var(x)) and the coefficient of variation (unitless, calculated as numpy.std(x)/numpy.mean(x)), but the two pick different lines, as shown here:

v1 = line(VAR(x))
v2 = line(CV(x))
print(v1, v2)

The result:

(12,17)

Shouldn't both find the same line? Which one is better to use in this case?

Answer

Sergey Bushmanov · Jan 15, 2016

The coefficient of variation and the variance are not supposed to choose the same array on arbitrary data. The coefficient of variation is sensitive to both the spread and the mean level of your data, because it divides the standard deviation by the mean. The variance measures the spread alone and does not change when a constant is added to every value.

Please see the example:

import numpy as np

x = np.random.randn(10)  # ten standard-normal samples, mean close to 0
x1 = x + 10              # the same samples shifted by a constant
np.var(x), np.std(x)/np.mean(x)

(2.0571740850649021, -2.2697110381499224)

np.var(x1), np.std(x1)/np.mean(x1)

(2.0571740850649016, 0.1531035017615747)
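The same behaviour can be checked on fixed data, so the numbers are reproducible. This is only a minimal sketch: the array values and the helper name cv are made up for illustration. It also covers multiplying by a positive constant, which changes the variance but leaves the coefficient of variation alone.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # fixed data: mean 3, population variance 2

def cv(a):
    # Coefficient of variation: standard deviation divided by the mean.
    # (Hypothetical helper, not part of NumPy.)
    return np.std(a) / np.mean(a)

shifted = x + 10  # add a constant: spread unchanged, mean moves
scaled = x * 10   # multiply by a positive constant: spread and mean both grow

var_x, var_shifted, var_scaled = np.var(x), np.var(shifted), np.var(scaled)
cv_x, cv_shifted, cv_scaled = cv(x), cv(shifted), cv(scaled)

print(var_x, var_shifted, var_scaled)
print(cv_x, cv_shifted, cv_scaled)
```

Here the variance stays at 2 after the shift and becomes 200 after the scaling, while the coefficient of variation shrinks after the shift and is unchanged by the scaling.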

Which one to choose depends on your application, but I'm leaning towards variance in your case.
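For the image case, a rough sketch of the per-line comparison might look like this, assuming the image is a 2-D array with one line per row. The array img and its values are invented for illustration and are not taken from the question.

```python
import numpy as np

# Hypothetical 3-line "image": each row is one line of pixel intensities.
img = np.array([
    [10.0, 12.0, 10.0, 12.0],      # dark line, small spread
    [200.0, 210.0, 200.0, 210.0],  # bright line, larger spread
    [1.0, 3.0, 1.0, 3.0],          # very dark line, small spread
])

row_var = np.var(img, axis=1)                        # variance of each line
row_cv = np.std(img, axis=1) / np.mean(img, axis=1)  # CV of each line

line_by_var = int(np.argmax(row_var))  # index of the line with the largest variance
line_by_cv = int(np.argmax(row_cv))    # index of the line with the largest CV

print(line_by_var, line_by_cv)
```

With this data the variance picks line 1 (the bright line with the largest absolute spread), while the coefficient of variation picks line 2 (the darkest line, where the same spread is large relative to the mean). Note that the coefficient of variation is undefined for a line whose mean is 0, such as an all-black line.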