NumPy Histogram - ValueError range parameter must be finite - input array is okay

Filippo Antonio Capizzi · Apr 5, 2017

I'm struggling to understand this error. Below I give one example that fails (the one I'm interested in) and a similar one that works.

I have to analyse a set of hourly prices for an entire year, called sys_prices, which, after various transformations, is a numpy.ndarray with 8785 rows (1 column). Every row is itself a numpy.ndarray with a single element, a numpy.float64 number.

The code that fails is the following:

stop_day = 95
start_day = stop_day - 10 # 10 days before
stop_day = (stop_day-1)*24
start_day = (start_day-1)*24

pcs=[] # list of prices to analyse
for ii in range(start_day, stop_day):
    pcs.append(sys_prices[ii][0])

p, x = np.histogram(pcs, bins='fd') 

The *24 part converts day numbers into indices into the dataset, to respect its hourly resolution.

What I expect is that passing the list pcs to np.histogram gives me the histogram values and the bin edges in p and x, respectively.

I say that I expect this because the following code works:

start_day = 1 
start_month = 1 
start_year = 2016 
stop_day = 1
stop_month = 2 
stop_year = 2016
num_prices = (date(stop_year, stop_month, stop_day) - date(start_year, start_month, start_day)).days*24

jan_prices = []
for ii in range(num_prices):
    jan_prices.append(sys_prices[ii][0])

p, x = np.histogram(jan_prices, bins='fd') # bin the data

The difference between the two is that the failing code analyses only the 10 days leading up to a chosen day of the year, while the working example uses all the prices in the month of January (i.e. the first 744 values of the dataset).

Stranger still: I tried different values for stop_day, and 95 raises the error, while 99, 100 or 200 don't.

Could you help me?

Answer

Filippo Antonio Capizzi · Apr 6, 2017

I solved it: there was a single NaN in the dataset that I couldn't spot.
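To illustrate why one NaN is enough, here is a minimal sketch with made-up prices (not the real sys_prices data). np.histogram derives its range from the minimum and maximum of the input, and both become NaN as soon as one element is NaN. The exact wording of the error message depends on the NumPy version, so the sketch only catches ValueError and prints whatever message it gets.

```python
import numpy as np

# Made-up hourly prices with a single NaN hidden among them (assumption:
# the real data looked similar once flattened to a 1-D sequence).
prices = np.array([31.5, 29.8, np.nan, 33.1, 30.2])

# min() and max() propagate NaN, so the automatic range is not finite.
print(prices.min(), prices.max())

try:
    np.histogram(prices, bins='fd')
    raised = False
except ValueError as exc:
    # The message text differs between NumPy versions.
    raised = True
    print("histogram failed:", exc)
```

This would also explain why only some values of stop_day fail: the error appears only when the selected window happens to contain the NaN.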

For those wondering how to spot it, I just used this code to find the index of the item:

import numpy

nanlist = []  # indices of the NaN entries
for ii in range(len(array)):
    if numpy.isnan(array[ii]):
        nanlist.append(ii)

Here array is your container.
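As a possible alternative to the loop, the same search can be done without iterating in Python, and the non-finite values can be dropped before binning. This is only a sketch: the data is made up, and it assumes the container is a 2-D float array with one column, which may not match how sys_prices was actually built. The names values, nan_idx and clean are illustrative.

```python
import numpy as np

# Made-up data shaped like the question's array: one column, one NaN.
sys_prices = np.array([[31.5], [29.8], [np.nan], [33.1], [30.2], [28.9]])

values = sys_prices[:, 0]                    # take the single column as 1-D
nan_idx = np.flatnonzero(np.isnan(values))   # indices of the NaN entries
print("NaN at indices:", nan_idx)

clean = values[np.isfinite(values)]          # drop NaN and +/-inf
p, x = np.histogram(clean, bins='fd')        # now the range is finite
```

Whether dropping the NaN is appropriate depends on the analysis. For hourly prices, a missing hour might instead need to be interpolated or flagged.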