I've been trying to use this implementation of the Holt-Winters algorithm for time series forecasting in Python but have run into a roadblock: for some series of (positive) inputs, it sometimes forecasts negative numbers, which should clearly not be the case. Even when the forecasts are not negative, they are sometimes wildly inaccurate, orders of magnitude higher or lower than they should be. Giving the algorithm more periods of data to work with does not appear to help, and in fact often makes the forecast worse.
The data I'm using has the following characteristics, which might be part of the problem:
Very frequently sampled (one data point every 15 minutes, as opposed to the monthly data the example uses). From what I've read, the Holt-Winters algorithm shouldn't have a problem with that, so perhaps that indicates a problem with the implementation?
Has multiple periodicities: there are daily peaks (i.e. every 96 data points) as well as a weekly cycle in which weekend data is significantly lower than weekday data. For example, weekdays can peak around 4000 but weekends peak at 1000. Even when I only give it weekday data, though, I run into the negative-number problem.
Is there something I'm missing with either the implementation or my usage of the Holt-Winters algorithm in general? I'm not a statistician so I'm using the 'default' values of alpha, beta, and gamma indicated in the link above - is that likely to be the problem, and is there a better way to calculate those values?
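For reference, a common alternative to fixed defaults is to pick alpha, beta, and gamma by minimizing the one-step-ahead squared error on the historical data. The following is only a minimal sketch under stated assumptions: it uses a simple additive model with a basic initialization, and `hw_additive` and `grid_search` are hypothetical names, not functions from the linked implementation.

```python
from itertools import product

def hw_additive(y, alpha, beta, gamma, m, horizon):
    """Simple additive Holt-Winters (illustrative, not the linked code).

    Returns (sum of squared one-step-ahead errors, list of forecasts).
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Basic initialization from the first two seasons (an assumption).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    sse = 0.0
    for t in range(m, len(y)):
        s = season[t % m]
        predicted = level + trend + s
        sse += (y[t] - predicted) ** 2
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s
    n = len(y)
    forecasts = [level + h * trend + season[(n + h - 1) % m]
                 for h in range(1, horizon + 1)]
    return sse, forecasts

def grid_search(y, m, steps=10):
    """Try every (alpha, beta, gamma) on a coarse grid; keep the lowest SSE."""
    grid = [i / steps for i in range(1, steps)]
    return min(product(grid, repeat=3),
               key=lambda p: hw_additive(y, *p, m, 0)[0])

y = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]
best = grid_search(y, 4)
best_sse, best_forecasts = hw_additive(y, *best, 4, 4)
print("best (alpha, beta, gamma):", best)
print("forecasts:", best_forecasts)
```

With only a handful of points the in-sample error is easy to overfit, so parameters chosen this way are best checked against data the search never saw.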
Or ... is there a better algorithm to use here than Holt-Winters? Ultimately I just want to create sensible forecasts from historical data here. I've tried single- and double-exponential smoothing but (as far as I understand) neither supports periodicity in data.
Any help/input would be greatly appreciated!
I tried generating random data until I got interesting results. Here I fed in all positive numbers and got negative forecasts:
y = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]
holtwinters(y, 0.2, 0.1, 0.05, 4)
...
forecast: -0.104857182966
forecast: -0.197407475203
forecast: -0.463988558577
forecast: -0.258023593197
But note that the forecast follows the negative slope of the data.
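To see why a downward trend alone can push forecasts below zero, here is a rough illustration. It fits an ordinary least-squares line to the same series and extends it four steps. The straight line is only a stand-in for the trend component (an assumption on my part); it is not what `holtwinters` computes internally.

```python
# Same all-positive series as above.
y = [0.92, 0.78, 0.92, 0.61, 0.47, 0.4, 0.59, 0.13, 0.27, 0.31, 0.24, 0.01]

n = len(y)
xs = range(n)
x_mean = sum(xs) / n
y_mean = sum(y) / n

# Least-squares slope and intercept, computed by hand (no libraries needed).
slope = (sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, y))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

# Extend the line four steps past the end of the data.
extrapolated = [intercept + slope * x for x in range(n, n + 4)]

print("slope per step:", slope)
print("extrapolated:", extrapolated)
```

Every input is positive, yet the line is already below zero one step past the data. An additive trend has no floor, so it keeps going in whatever direction the recent data points.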
This might be the orders of magnitude you were talking about:
y = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]
holtwinters(y, 0.2, 0.1, 0.05, 4)
...
forecast: 1.93777559066
forecast: 3.11109138055
forecast: 0.910967977635
forecast: 0.684668348397
But I'm not sure how you'd deem it wildly inaccurate or judge that it "should be" lower.
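One way to make "wildly inaccurate" concrete is to hold out the last season, forecast it from the rest, and score the result, ideally next to a trivial baseline. This is a minimal sketch assuming a season length of 4 as in the call above; `seasonal_naive` and `mean_absolute_error` are hypothetical helpers, not part of the implementation you linked.

```python
def seasonal_naive(train, m, horizon):
    """Baseline forecast: repeat the last observed season."""
    last_season = train[-m:]
    return [last_season[h % m] for h in range(horizon)]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

y = [0.1, 0.68, 0.15, 0.08, 0.94, 0.58, 0.35, 0.38, 0.7, 0.74, 0.93, 0.87]
m = 4

# Hold out the final season; only the earlier points are used to forecast.
train, held_out = y[:-m], y[-m:]

baseline = seasonal_naive(train, m, m)
baseline_mae = mean_absolute_error(held_out, baseline)

print("baseline forecast:", baseline)
print("baseline MAE:", baseline_mae)
```

Running your Holt-Winters forecast on `train` and scoring it the same way gives a number to compare: if it cannot beat the baseline on held-out data, that is a more useful definition of "inaccurate" than eyeballing the output.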
Whenever you're extrapolating data, you're going to have somewhat surprising results. Are you concerned more that the implementation might be incorrect or that the output doesn't have good properties for your specific usage?