`ValueError: A value in x_new is above the interpolation range.` - what could cause it other than non-ascending values?

durbachit · Aug 1, 2017 · Viewed 32.1k times

I get this error from SciPy's interp1d function. Normally, I would expect this error if x were not monotonically increasing.

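import numpy as np
import scipy.interpolate as spi

def refine(coarsex, coarsey, step):
    # Build a finer x grid spanning the coarse data
    finex = np.arange(min(coarsex), max(coarsex) + step, step)
    # Linear interpolation along the first axis
    intfunc = spi.interp1d(coarsex, coarsey, axis=0)
    finey = intfunc(finex)
    return finex, finey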

for num, tfile in enumerate(files):
    tfile = tfile.dropna(how='any')
    x = np.array(tfile['col1'])
    y = np.array(tfile['col2'])
    finex, finey = refine(x,y,0.01)

The code itself seems fine: it worked on six data files and threw the error only on the seventh. So there must be something wrong with the data. But as far as I can tell, the values increase all the way down the column. I am sorry for not providing an example; I have not been able to reproduce the error on one.

There are two things that could help me:

  1. Some brainstorming: if the data are indeed monotonically increasing, what else could produce this error? Another hint, regarding the decimals, could be in this question, but I think my solution (using the min and max of x) is robust enough to avoid it. Or isn't it?
  2. Is it possible (and how) to return the offending value of x_new and its index when the ValueError: A value in x_new is above the interpolation range. is thrown, so that I can see where in the file the problem is?
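For the second point, one possible approach is to compare x_new against the range of x yourself before calling the interpolator. The sketch below is only an illustration: the helper name `out_of_range` and the sample arrays are made up, not taken from the question's data.

```python
import numpy as np

def out_of_range(x, x_new):
    """Return (indices, values) of the x_new entries outside [min(x), max(x)].

    Hypothetical helper, not part of SciPy.
    """
    x = np.asarray(x, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    bad = (x_new < x.min()) | (x_new > x.max())
    idx = np.flatnonzero(bad)
    return idx, x_new[idx]

# Made-up data: the last point of finex lies above max(coarsex)
coarsex = np.array([0.0, 0.5, 1.0])
finex = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.25])

idx, vals = out_of_range(coarsex, finex)
for i, v in zip(idx, vals):
    # 17 significant digits show the full stored value, not the rounded display
    print(f"index {i}: x_new = {v:.17g}, max(x) = {coarsex.max():.17g}")
```

Printing with 17 significant digits matters here, because two floats that look identical in the default display can still differ in the last bits.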

UPDATE

So the problem is that, for some reason, max(finex) is larger than max(coarsex) (one is .x39 and the other is .x4). I hoped rounding the original values to 2 significant digits would solve the problem, but it didn't: rounding only changes how many digits are displayed, and the calculation still uses the undisplayed ones. What can I do about it?
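One way to avoid the overshoot, sketched here under the assumption that dropping any grid point beyond the data is acceptable, is to filter the grid after building it. With `max(coarsex) + step` as the stop value, `np.arange` can return a last point above `max(coarsex)` whenever the data range is not an exact multiple of `step`, or when floating-point rounding adds an extra element. The helper name `safe_grid` and the sample values are hypothetical.

```python
import numpy as np

def safe_grid(coarsex, step):
    """Build a fine grid that never leaves [min(coarsex), max(coarsex)].

    Hypothetical helper: same arange call as in the question,
    followed by a filter that drops points above the data maximum.
    """
    coarsex = np.asarray(coarsex, dtype=float)
    lo, hi = coarsex.min(), coarsex.max()
    finex = np.arange(lo, hi + step, step)
    return finex[finex <= hi]

# Made-up data whose range (0.29) is not a multiple of the step (0.1)
coarsex = np.array([0.10, 0.20, 0.39])
step = 0.1

raw = np.arange(coarsex.min(), coarsex.max() + step, step)
finex = safe_grid(coarsex, step)

print(raw.max() > coarsex.max())     # the unfiltered grid overshoots
print(finex.max() <= coarsex.max())  # the filtered grid does not
```

The filtered `finex` can then be passed to the function returned by `spi.interp1d` as before. Note that the last grid point may now fall short of `max(coarsex)` by less than one step.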

Answer

saintsfan342000 · Aug 1, 2017

If you are running SciPy 0.17.0 or newer, you can pass fill_value='extrapolate' to spi.interp1d, and it will extrapolate to accommodate the values that lie outside the interpolation range. So define your interpolation function like so:

intfunc = spi.interp1d(coarsex, coarsey,axis=0, fill_value="extrapolate")

Be forewarned, however!

Depending on what your data look like and the type of interpolation you are performing, the extrapolated values can be erroneous. This is especially true if you have noisy or non-monotonic data. In your case you might be OK because your x_new value is only slightly beyond your interpolation range.

Here's a simple demonstration of how this feature can work nicely but also give erroneous results.

import scipy.interpolate as spi
import numpy as np

x = np.linspace(0,1,100)
y = x + np.random.randint(-1,1,100)/100
x_new = np.linspace(0,1.1,100)
intfunc = spi.interp1d(x,y,fill_value="extrapolate")
y_interp = intfunc(x_new)

import matplotlib.pyplot as plt
plt.plot(x_new,y_interp,'r', label='interp/extrap')
plt.plot(x,y, 'b--', label='data')
plt.legend()
plt.show()

[Figure: the data (blue dashed line) and the interpolated/extrapolated curve (red line)]

So the interpolated portion (in red) worked well, but the extrapolated portion clearly fails to follow the otherwise linear trend in this data because of the noise. So have some understanding of your data and proceed with caution.
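If extrapolating noisy data is a concern, a more conservative alternative is to clamp out-of-range queries to the boundary values instead. This is a minimal sketch, assuming that holding the first and last y values constant outside the range is acceptable for your data; it uses interp1d's `bounds_error=False` together with a two-element `fill_value` tuple (below, above). The sample data are made up.

```python
import numpy as np
import scipy.interpolate as spi

# Made-up smooth data on [0, 1]
x = np.linspace(0, 1, 11)
y = x ** 2

# Outside the range, return y[0] (below) or y[-1] (above) instead of raising
intfunc = spi.interp1d(x, y, bounds_error=False, fill_value=(y[0], y[-1]))

x_new = np.array([-0.05, 0.5, 1.05])  # two points slightly outside the range
y_new = intfunc(x_new)
print(y_new)
```

For a point that is only marginally beyond the range, as in the question, the clamped value and the extrapolated value will usually be very close; the difference is that clamping cannot run away when the last segment is noisy.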