Python - Minimizing Chi-squared

Will282 picture Will282 · Mar 4, 2014 · Viewed 12.1k times · Source

I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately using the code below is not correctly minimizing the chisqfunc function. It is finding the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions which has worked correctly. Could you please suggest how to fix the code below or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result

Thank you for reading my question and any help would be greatly appreciated.

Cheers, Will

EDIT: Data set I am currently using: Link to data

Answer

gg349 picture gg349 · Mar 5, 2014

The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc() like print (a,b), and rerun your code, you'll get something like:

(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)

This means that minimize evaluates the function only at these points.

if you now try to evaluate chisqfunc() at these 3 pairs of values, you'll see that they EXACTLY match, for example

print chisqfunc((0,0))==chisqfunc((1.4901161193847656e-08,0))
True

This happens because of rounding floating points arithmetics. In other words, when evaluating stress - model, the var stress is too many order of magnitude larger than model, and the result is truncated.

One could then just try bruteforcing it, increasing floating point precision, with writing data=data.astype(np.float128) just after loading the data with loadtxt. minimize fails, with result.success=False, but with a helpful message

Desired error not necessarily achieved due to precision loss.

One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude, the other to rescale the data, so that the solution will be closer to your initial guess (0,0).

It is MUCH better if you just rescale the data, making for example nondimensional with respect to a certain stress value (like the yelding/cracking of this material)

This is an example of the fitting, using as a stress scale the maximum measured stress. There are very few changes from your code:

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]


smax = stress.max()
stress = stress/smax
#I am assuming the errors err_stress are in the same units of stress.
err_stress = err_stress/smax

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result
assert result.success==True
a,b=result.x*smax
plot(strain,stress*smax)
plot(strain,a+b*strain)

Your linear model is quite good, i.e. your material has a very linear behaviour for this range of deformation (what material is it anyway?): enter image description here