Using scipy.optimize.curve_fit with weights

xnx picture xnx · Dec 29, 2014 · Viewed 11.4k times · Source

According to the documentation, the argument sigma can be used to set the weights of the data points in the fit. These "describe" 1-sigma errors when the argument absolute_sigma=True.

I have some data with artificial normally-distributed noise which varies:

n = 200
x = np.linspace(1, 20, n)
x0, A, alpha = 12, 3, 3

def f(x, x0, A, alpha):
    return A * np.exp(-((x-x0)/alpha)**2)

noise_sigma = x/20
noise = np.random.randn(n) * noise_sigma
yexact = f(x, x0, A, alpha)
y = yexact + noise

If I want to fit the noisy y to f using curve_fit to what should I set sigma? The documentation isn't very specific here, but I would usually use 1/noise_sigma**2 as the weight:

p0 = 10, 4, 2
popt, pcov = curve_fit(f, x, y, p0)
popt2, pcov2 = curve_fit(f, x, y, p0, sigma=1/noise_sigma**2, absolute_sigma=True)

It doesn't seem to improve the fit much, though.

enter image description here

Is this option only used to better interpret the fit uncertainties through the covariance matrix? What is the difference between these two telling me?

In [249]: pcov
Out[249]: 
array([[  1.10205238e-02,  -3.91494024e-08,   8.81822412e-08],
       [ -3.91494024e-08,   1.52660426e-02,  -1.05907265e-02],
       [  8.81822412e-08,  -1.05907265e-02,   2.20414887e-02]])

In [250]: pcov2
Out[250]: 
array([[ 0.26584674, -0.01836064, -0.17867193],
       [-0.01836064,  0.27833   , -0.1459469 ],
       [-0.17867193, -0.1459469 ,  0.38659059]])

Answer

user545424 picture user545424 · Dec 15, 2018

At least with scipy version 1.1.0 the parameter sigma should be equal to the error on each parameter. Specifically the documentation says:

A 1-d sigma should contain values of standard deviations of errors in ydata. In this case, the optimized function is chisq = sum((r / sigma) ** 2).

In your case that would be:

curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=True)

I looked through the source code and verified that when you specify sigma this way it minimizes ((f-data)/sigma)**2.

As a side note, this is in general what you want to be minimizing when you know the errors. The likelihood of observing points data given a model f is given by:

L(data|x0,A,alpha) = product over i Gaus(data_i, mean=f(x_i,x0,A,alpha), sigma=sigma_i)

which if you take the negative log becomes (up to constant factors that don't depend on the parameters):

-log(L) = sum over i (f(x_i,x0,A,alpha)-data_i)**2/(sigma_i**2)

which is just the chisquare.

I wrote a test program to verify that curve_fit was indeed returning the correct values with the sigma specified correctly:

from __future__ import print_function
import numpy as np
from scipy.optimize import curve_fit, fmin

np.random.seed(0)

def make_chi2(x, data, sigma):
    def chi2(args):
        x0, A, alpha = args
        return np.sum(((f(x,x0,A,alpha)-data)/sigma)**2)
    return chi2

n = 200
x = np.linspace(1, 20, n)
x0, A, alpha = 12, 3, 3

def f(x, x0, A, alpha):
    return A * np.exp(-((x-x0)/alpha)**2)

noise_sigma = x/20
noise = np.random.randn(n) * noise_sigma
yexact = f(x, x0, A, alpha)
y = yexact + noise

p0 = 10, 4, 2

# curve_fit without parameters (sigma is implicitly equal to one)
popt, pcov = curve_fit(f, x, y, p0)
# curve_fit with wrong sigma specified
popt2, pcov2 = curve_fit(f, x, y, p0, sigma=1/noise_sigma**2, absolute_sigma=True)
# curve_fit with correct sigma
popt3, pcov3 = curve_fit(f, x, y, p0, sigma=noise_sigma, absolute_sigma=True)

chi2 = make_chi2(x,y,noise_sigma)

# double checking that we get the correct answer
xopt = fmin(chi2,p0,xtol=1e-10,ftol=1e-10)

print("popt  = %s, chi2 = %.2f" % (popt,chi2(popt)))
print("popt2 = %s, chi2 = %.2f" % (popt2, chi2(popt2)))
print("popt3 = %s, chi2 = %.2f" % (popt3, chi2(popt3)))
print("xopt  = %s, chi2 = %.2f" % (xopt, chi2(xopt)))

which outputs:

popt  = [ 11.93617403   3.30528488   2.86314641], chi2 = 200.66
popt2 = [ 11.94169083   3.30372955   2.86207253], chi2 = 200.64
popt3 = [ 11.93128545   3.333727     2.81403324], chi2 = 200.44
xopt  = [ 11.93128603   3.33373094   2.81402741], chi2 = 200.44

As you can see the chi2 is indeed minimized correctly when you specify sigma=sigma as an argument to curve_fit.

As to why the improvement isn't "better", I'm not really sure. My only guess is that without specifying a sigma value you implicitly assume they are equal and over the part of the data where the fit matters (the peak), the errors are "approximately" equal.

To answer your second question, no the sigma option is not only used to change the output of the covariance matrix, it actually changes what is being minimized.