Can someone explain why scipy.integrate.quad gives different results for equally long ranges while integrating sin(X)?

batbrat picture batbrat · Feb 24, 2009 · Viewed 14.1k times · Source

I am trying to numerically integrate an arbitrary (known when I code) function in my program using numerical integration methods. I am using Python 2.5.2 along with SciPy's numerical integration package. In order to get a feel for it, i decided to try integrating sin(x) and observed this behavior-

>>> from math import pi
>>> from scipy.integrate import quad
>>> from math import sin
>>> def integrand(x):
...     return sin(x)
... 
>>> quad(integrand, -pi, pi)
(0.0, 4.3998892617846002e-14)
>>> quad(integrand, 0, 2*pi)
(2.2579473462709165e-16, 4.3998892617846002e-14)

I find this behavior odd because -
1. In ordinary integration, integrating over the full cycle gives zero.
2. In numerical integration, this (1) isn't necessarily the case, because you may just be approximating the total area under the curve.

In any case, either assuming 1 is True or assuming 2 is True, I find the behavior to be inconsistent. Either both integrations (-pi to pi and 0 to 2*pi) should return 0.0 (first value in the tuple is the result and the second is the error) or return 2.257...

Can someone please explain why this is happening? Is this really an inconsistency? Can someone also tell me if I am missing something really basic about numerical methods?

In any case, in my final application, I plan to use the above method to find the arc length of a function. If someone has experience in this area, please advise me on the best policy for doing this in Python.

Edit
Note
I already have the first differential values at all points in the range stored in an array.
Current error is tolerable.
End note

I have read Wikipaedia on this. As Dimitry has pointed out, I will be integrating sqrt(1+diff(f(x), x)^2) to get the Arc Length. What I wanted to ask was - is there a better approximation/ Best practice(?) / faster way to do this. If more context is needed, I'll post it separately/ post context here, as you wish.

Answer

physicsmichael picture physicsmichael · Feb 24, 2009

The quad function is a function from an old Fortran library. It works by judging by the flatness and slope of the function it is integrating how to treat the step size it uses for numerical integration in order to maximize efficiency. What this means is that you may get slightly different answers from one region to the next even if they're analytically the same.

Without a doubt both integrations should return zero. Returning something that is 1/(10 trillion) is pretty close to zero! The slight differences are due to the way quad is rolling over sin and changing its step sizes. For your planned task, quad will be all you need.

EDIT: For what you're doing I think quad is fine. It is fast and pretty accurate. My final statement is use it with confidence unless you find something that really has gone quite awry. If it doesn't return a nonsensical answer then it is probably working just fine. No worries.