How should the interquartile range be calculated in Python?

d3pd · Dec 14, 2014

I have a list of numbers [1, 2, 3, 4, 5, 6, 7] and I want a function that returns the interquartile range of this list. The interquartile range is the difference between the upper and lower quartiles. I have attempted to calculate the interquartile range using NumPy functions and using Wolfram Alpha. All of the answers I get, from my manual one, to the NumPy one, to the Wolfram Alpha one, are different. I do not know why this is.

My attempt in Python is as follows:

>>> import numpy
>>> a = numpy.array([1, 2, 3, 4, 5, 6, 7])
>>> numpy.percentile(a, 25)
2.5
>>> numpy.percentile(a, 75)
5.5
>>> numpy.percentile(a, 75) - numpy.percentile(a, 25) # IQR
3.0

My attempt in Wolfram Alpha is as follows:

So, I find that the values returned by NumPy and Wolfram Alpha for what I think are the first quartile, the third quartile and the interquartile range are not consistent. Why is this? What should I be doing in Python to calculate the interquartile range correctly?

As far as I am aware, the interquartile range of [1, 2, 3, 4, 5, 6, 7] should be the following:

median(5, 6, 7) - median(1, 2, 3) = 4.
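
The manual calculation above can be sketched in plain Python. This is only an illustration: the helper name `iqr_median_of_halves` is hypothetical, and it assumes the convention in which the overall median is left out of both halves when the list has an odd length. Other quartile conventions give other results.

```python
from statistics import median


def iqr_median_of_halves(values):
    """IQR as median(upper half) - median(lower half).

    Assumption: for an odd number of values, the middle value is
    excluded from both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to form two halves")
    half = len(data) // 2
    lower = data[:half]               # values below the median
    upper = data[len(data) - half:]   # values above the median
    return median(upper) - median(lower)


print(iqr_median_of_halves([1, 2, 3, 4, 5, 6, 7]))  # median(5, 6, 7) - median(1, 2, 3)
```

For [1, 2, 3, 4, 5, 6, 7] the two halves are [1, 2, 3] and [5, 6, 7], so the result is 6 - 2 = 4.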

Answer

warner121 · Dec 14, 2014

Version 1.9 of NumPy added a handy 'interpolation' argument to numpy.percentile that helps you get to 4.

import numpy

a = numpy.array([1, 2, 3, 4, 5, 6, 7])
numpy.percentile(a, 75, interpolation='higher') - numpy.percentile(a, 25, interpolation='lower')
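
The default result of 3.0 comes from linear interpolation between neighbouring data points, which is what numpy.percentile does unless told otherwise. Note also that NumPy 1.22 renamed the 'interpolation' argument to 'method' and deprecated the old name. A minimal sketch that tries the newer keyword first is shown below. The helper name `iqr_outer` is hypothetical, and the fallback assumes that an older NumPy rejects the unknown keyword with a TypeError.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])


def iqr_outer(values):
    """IQR using the 'higher' point for Q3 and the 'lower' point for Q1."""
    try:
        # NumPy >= 1.22 spells the argument 'method'
        q3 = np.percentile(values, 75, method="higher")
        q1 = np.percentile(values, 25, method="lower")
    except TypeError:
        # Older NumPy (1.9 to 1.21) only knows 'interpolation'
        q3 = np.percentile(values, 75, interpolation="higher")
        q1 = np.percentile(values, 25, interpolation="lower")
    return q3 - q1


print(np.percentile(a, 75) - np.percentile(a, 25))  # default linear interpolation
print(iqr_outer(a))                                 # 'higher' minus 'lower'
```

This matches the median-of-halves answer for this particular list, but the two conventions are not the same in general. For [1, 2, 3, 4, 5, 6, 7, 8], 'higher' minus 'lower' gives 7 - 2 = 5, while median(5, 6, 7, 8) - median(1, 2, 3, 4) gives 4.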