I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately the code below is not correctly minimizing the chisqfunc function: it finds the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions, which worked correctly. Could you please suggest how to fix the code below, or suggest another method I can use to fit a linear model to data by minimizing chi-squared?
import numpy
import scipy.optimize as opt

filename = 'data.csv'
data = numpy.loadtxt(open(filename, "r"), delimiter=",")

stress = data[:, 0]
strain = data[:, 1]
err_stress = data[:, 2]

def chisqfunc(params):
    a, b = params
    model = a + b * strain
    # chi-squared: weighted sum of squared residuals
    chisq = numpy.sum(((stress - model) / err_stress) ** 2)
    return chisq

x0 = numpy.array([0, 0])
result = opt.minimize(chisqfunc, x0)
print(result)
Thank you for reading my question; any help would be greatly appreciated.
Cheers, Will
EDIT: Data set I am currently using: Link to data
The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc(), like print(a, b), and rerun your code, you'll get something like:
(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)
This means that minimize evaluates the function only at these points; the offsets of about 1.5e-8 are the finite-difference steps it uses to approximate the gradient.
If you now evaluate chisqfunc() at these 3 pairs of values, you'll see that they match EXACTLY, for example
print(chisqfunc((0, 0)) == chisqfunc((1.4901161193847656e-08, 0)))
True
This happens because of floating-point rounding: when evaluating stress - model, the variable stress is many orders of magnitude larger than model, so the tiny change in model is truncated away and the function value does not change.
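To see the truncation concretely, here is a minimal sketch; the stress magnitude of 1e9 is only an assumption about the scale of the data, not a value taken from it:
import numpy

# The probe offset is roughly sqrt(machine epsilon) for float64, the default
# step used by the finite-difference gradient approximation:
print(numpy.sqrt(numpy.finfo(numpy.float64).eps))  # 1.4901161193847656e-08

# Hypothetical stress value on the order of 1e9 (an assumed magnitude, not from
# the data): a shift of ~1.5e-8 is smaller than the spacing between adjacent
# float64 values at that magnitude, so the subtraction returns the same number.
stress_value = 1.0e9
print(stress_value - 1.4901161193847656e-08 == stress_value)  # True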
One could then just try brute-forcing it, increasing the floating-point precision by writing data = data.astype(numpy.float128) just after loading the data with loadtxt. minimize still fails, with result.success=False, but with a helpful message:
Desired error not necessarily achieved due to precision loss.
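For reference, that brute-force change is a one-line cast right after loadtxt; float128 is an extended-precision type whose availability depends on your platform's NumPy build, so treat this only as a sketch:
import numpy

filename = 'data.csv'
data = numpy.loadtxt(open(filename, "r"), delimiter=",")
# Cast everything to extended precision (only works if your NumPy build
# provides float128). As noted above, minimize still stops with
# result.success == False and the precision-loss message.
data = data.astype(numpy.float128)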
One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude as stress; the other is to rescale the data, so that the solution will be closer to your initial guess (0, 0).
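If you go the first route, a crude slope/intercept estimate computed from the data itself is usually enough to bring a + b*strain onto the same scale as stress. This is only a sketch of that idea (the starting values a0 and b0 are hypothetical helpers, not part of the original code), and whether it converges still depends on the data; rescaling, discussed next, is the more robust choice:
import numpy
import scipy.optimize as opt

filename = 'data.csv'
data = numpy.loadtxt(open(filename, "r"), delimiter=",")
stress, strain, err_stress = data[:, 0], data[:, 1], data[:, 2]

def chisqfunc(params):
    a, b = params
    model = a + b * strain
    return numpy.sum(((stress - model) / err_stress) ** 2)

# Crude starting values: slope from the overall ranges, intercept from the means,
# so the initial model is already of the same order of magnitude as stress.
b0 = (stress.max() - stress.min()) / (strain.max() - strain.min())
a0 = stress.mean() - b0 * strain.mean()
x0 = numpy.array([a0, b0])

result = opt.minimize(chisqfunc, x0)
print(result)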
It is MUCH better if you just rescale the data, making it nondimensional with respect to, for example, a characteristic stress value (such as the yielding/cracking stress of this material).
Here is an example of the fit, using the maximum measured stress as the stress scale. There are very few changes from your code:
import numpy
import scipy.optimize as opt
import matplotlib.pyplot as plt

filename = 'data.csv'
data = numpy.loadtxt(open(filename, "r"), delimiter=",")

stress = data[:, 0]
strain = data[:, 1]
err_stress = data[:, 2]

# Rescale by the maximum measured stress so the solution is of order one.
smax = stress.max()
stress = stress / smax
# I am assuming the errors err_stress are in the same units as stress.
err_stress = err_stress / smax

def chisqfunc(params):
    a, b = params
    model = a + b * strain
    chisq = numpy.sum(((stress - model) / err_stress) ** 2)
    return chisq

x0 = numpy.array([0, 0])
result = opt.minimize(chisqfunc, x0)
print(result)
assert result.success

# Convert the fitted parameters back to the original stress units.
a, b = result.x * smax
plt.plot(strain, stress * smax)
plt.plot(strain, a + b * strain)
plt.show()
Your linear model is quite good, i.e. your material has a very linear behaviour over this range of deformation (what material is it, anyway?).