trying to get reasonable values from scipy powerlaw fit

zje picture zje · Apr 16, 2012 · Viewed 9k times · Source

I'm trying to fit some data from a simulation code I've been running in order to figure out a power law dependence. When I plot a linear fit, the data does not fit very well.

Here's the python script I'm using to fit the data:

#!/usr/bin/env python
from scipy import optimize
import numpy

xdata=[ 0.00010851,  0.00021701,  0.00043403,  0.00086806,  0.00173611, 0.00347222]
ydata=[ 29.56241016,  29.82245508,  25.33930469,  19.97075977,  12.61276074, 7.12695312]

fitfunc = lambda p, x: p[0] + p[1] * x ** (p[2])
errfunc = lambda p, x, y: (y - fitfunc(p, x))

out,success = optimize.leastsq(errfunc, [1,-1,-0.5],args=(xdata, ydata),maxfev=3000)

print "%g + %g*x^%g"%(out[0],out[1],out[2])

the output I get is: -71205.3 + 71174.5*x^-9.79038e-05

While on the plot the fit looks about as good as you'd expect from a leastsquares fit, the form of the output bothers me. I was hoping the constant would be close to where you'd expect the zero to be (around 30). And I was expecting to find a power dependence of a larger fraction than 10^-5.

I've tried rescaling my data and playing with the parameters to optimize.leastsq with no luck. Is what I'm trying to accomplish possible or does my data just not allow it? The calculation is expensive, so getting more data points is non-trivial.

Thanks!

Answer

fraxel picture fraxel · Apr 17, 2012

It is much better to first take the logarithm, then use leastsquare to fit to this linear equation, which will give you a much better fit. There is a great example in the scipy cookbook, which I've adapted below to fit your code.

The best fits like this are: amplitude = 0.8955, and index = -0.40943265484

As we can see from the graph (and your data), if its a power law fit we would not expect the amplitude value to be near 30. As in the power law equation f(x) == Amp * x ** index, so with a negative index: f(1) == Amp and f(0) == infinity.

enter image description here

from pylab import *
from scipy import *
from scipy import optimize

xdata=[ 0.00010851,  0.00021701,  0.00043403,  0.00086806,  0.00173611, 0.00347222]
ydata=[ 29.56241016,  29.82245508,  25.33930469,  19.97075977,  12.61276074, 7.12695312]

logx = log10(xdata)
logy = log10(ydata)

# define our (line) fitting function
fitfunc = lambda p, x: p[0] + p[1] * x
errfunc = lambda p, x, y: (y - fitfunc(p, x))

pinit = [1.0, -1.0]
out = optimize.leastsq(errfunc, pinit,
                       args=(logx, logy), full_output=1)

pfinal = out[0]
covar = out[1]

index = pfinal[1]
amp = 10.0**pfinal[0]

print 'amp:',amp, 'index', index

powerlaw = lambda x, amp, index: amp * (x**index)
##########
# Plotting data
##########
clf()
subplot(2, 1, 1)
plot(xdata, powerlaw(xdata, amp, index))     # Fit
plot(xdata, ydata)#, yerr=yerr, fmt='k.')  # Data
text(0.0020, 30, 'Ampli = %5.2f' % amp)
text(0.0020, 25, 'Index = %5.2f' % index)
xlabel('X')
ylabel('Y')

subplot(2, 1, 2)
loglog(xdata, powerlaw(xdata, amp, index))
plot(xdata, ydata)#, yerr=yerr, fmt='k.')  # Data
xlabel('X (log scale)')
ylabel('Y (log scale)')

savefig('power_law_fit.png')
show()