This weekend I decided to try my hand at some Scala and Clojure. I'm proficient with object oriented programming, and so Scala was easy to pick up as a language, but wanted to try out functional programming. This is where it got hard.
I just can't seem to get my head into a mode of writing functions. As an expert functional programmer, how do you approach a problem?
Given a list of values and a defined period of summation, how would you generate a new list of the simple moving average of the list?
For example: Given the list values
(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), and the period
4, the function should return: (0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
After spending a day mulling it over, the best I could come up with in Scala was this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] = {
(for (i <- 1 to values.length)
yield
if (i < period) 0.00
else values.slice(i - period, i).reduceLeft(_ + _) / period).toList
}
I know this is horribly inefficient, I'd much rather do something like:
where n < period: ma(n) = 0
where n = period: ma(n) = sum(value(1) to value(n)) / period
where n > period: man(n) = ma(n -1) - (value(n-period) / period) + (value(n) / period)
Now that would be easily done in a imperative style, but I can't for the life of me work out how to express that functionally.
Interesting problem. I can think of many solutions, with varying degrees of efficiency. Having to add stuff repeatedly isn't really a performance problem, but let's assume it is. Also, the zeroes at the beginning can be prepended later, so let's not worry about producing them. If the algorithm provides them naturally, fine; if not, we correct it later.
Starting with Scala 2.8, the following would give the result for n >= period
by using sliding
to get a sliding window of the List:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
List.fill(period - 1)(0.0) ::: (values sliding period map (_.sum) map (_ / period))
Nevertheless, although this is rather elegant, it doesn't have the best performance possible, because it doesn't take advantage of already computed additions. So, speaking of them, how can we get them?
Let's say we write this:
values sliding 2 map sum
We have a list of the sum of each two pairs. Let's try to use this result to compute the moving average of 4 elements. The above formula made the following computation:
from d1, d2, d3, d4, d5, d6, ...
to (d1+d2), (d2+d3), (d3+d4), (d4+d5), (d5+d6), ...
So if we take each element and add it to the second next element, we get the moving average for 4 elements:
(d1+d2)+(d3+d4), (d2+d3)+(d4+d5), (d3+d4)+(d5+d6), ...
We may do it like this:
res zip (res drop 2) map Function.tupled(_+_)
We could then compute the moving average for 8 elements, and so on. Well, there is a well known algorithm to compute things that follow such pattern. It's most known for its use on computing the power of a number. It goes like this:
def power(n: Int, e: Int): Int = e match {
case 0 => 1
case 1 => n
case 2 => n * n
case odd if odd % 2 == 1 => power(n, (odd - 1)) * n
case even => power(power(n, even / 2), 2)
}
So, let's apply it here:
def movingSum(values: List[Double], period: Int): List[Double] = period match {
case 0 => throw new IllegalArgumentException
case 1 => values
case 2 => values sliding 2 map (_.sum)
case odd if odd % 2 == 1 =>
values zip movingSum(values drop 1, (odd - 1)) map Function.tupled(_+_)
case even =>
val half = even / 2
val partialResult = movingSum(values, half)
partialResult zip (partialResult drop half) map Function.tupled(_+_)
}
So, here's the logic. Period 0 is invalid, period 1 is equal to the input, period 2 is sliding window of size 2. If greater than that, it may be even or odd.
If odd, we add each element to the movingSum
of the next (odd - 1)
elements. For example, if 3, we add each element to the movingSum
of the next 2 elements.
If even, we compute the movingSum
for n / 2
, then add each element to the one n / 2
steps afterwards.
With that definition, we can then go back to the problem and do this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
List.fill(period - 1)(0.0) ::: (movingSum(values, period) map (_ / period))
There's a slight inefficiency with regards to the use of :::
, but it's O(period), not O(values.size). It can be made more efficient with a tail recursive function. And, of course, the definition of "sliding" I provided is horrendous performance-wise, but there will be a much better definition of it on Scala 2.8. Note that we can't make an efficient sliding
method on a List
, but we can do it on an Iterable
.
Having said all that, I'd go with the very first definition, and optimize only if a critical path analysis pinpointed this as a big deal.
To conclude, let's consider how I went about the problem. We have a moving average problem. A moving average is the sum of a moving "window" on a list, divided by the size of that window. So, first, I try to get a sliding window, sum everything on it, and then divide by the size.
The next problem was to avoid repetition of already computed additions. In this case, I went to the smallest addition possible, and tried to figure out how to compute bigger sums reusing such results.
Finally, let's try to solve the problem the way you figured it, by adding and subtracting from the previous result. Getting the first average is easy:
def movingAverage(values: List[Double], period: Int): List[Double] = {
val first = (values take period).sum / period
Now we make two lists. First, the list of elements to be subtracted. Next, the list of elements to be added:
val subtract = values map (_ / period)
val add = subtract drop period
We can add these two lists by using zip
. This method will only produce as many elements as the smaller list has, which avoids the problem of subtract
being bigger than necessary:
val addAndSubtract = add zip subtract map Function.tupled(_ - _)
We finish by composing the result with a fold:
val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
(acc, add) => (add + acc.head) :: acc
}).reverse
which is the answer to be returned. The whole function looks like this:
def movingAverage(values: List[Double], period: Int): List[Double] = {
val first = (values take period).sum / period
val subtract = values map (_ / period)
val add = subtract drop period
val addAndSubtract = add zip subtract map Function.tupled(_ - _)
val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
(acc, add) => (add + acc.head) :: acc
}).reverse
res
}