By what mechanism is this Fibonacci function memoized?
fib = (map fib' [0..] !!)
  where fib' 1 = 1
        fib' 2 = 1
        fib' n = fib (n-2) + fib (n-1)
And on a related note, why is this version not?
fib n = (map fib' [0..] !! n)
  where fib' 1 = 1
        fib' 2 = 1
        fib' n = fib (n-2) + fib (n-1)
The evaluation mechanism in Haskell is call-by-need: when a value is needed, it is calculated, and then kept ready in case it is asked for again. If we define some list, xs = [0..], and later ask for its 100th element, xs !! 99, the list gets "fleshed out" up to its 100th slot, which now holds the number 99, ready for the next access.
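You can watch this happen in GHCi with the :sprint command, which prints a binding without forcing it, showing _ for anything not yet evaluated; a session looks roughly like this:

ghci> xs = [0..] :: [Int]
ghci> :sprint xs
xs = _
ghci> xs !! 5
5
ghci> :sprint xs
xs = 0 : 1 : 2 : 3 : 4 : 5 : _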
That is what the "going through a list" trick is exploiting. In the normal doubly recursive Fibonacci definition, fib n = fib (n-1) + fib (n-2), the function itself gets called twice from the top, causing the exponential explosion. But with this trick we set out a list for the interim results, and go "through the list":

fib n = (xs !! (n-1)) + (xs !! (n-2))  where  xs = 0 : 1 : map fib [2..]
The trick is to get that list created, and to keep it from going away (by way of garbage collection) between calls to fib. The easiest way to achieve this is to name that list: "if you name it, it will stay."
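For instance (a sketch, with made-up names; certainly not the only way to write it), naming the list at the top level makes it a constant that persists between calls:

fibs :: [Integer]
fibs = map fib' [0..]                 -- named once, at the top level, so it stays around
  where fib' 1 = 1
        fib' 2 = 1
        fib' n = fibs !! (n-2) + fibs !! (n-1)

fib :: Int -> Integer
fib = (fibs !!)

Every call to fib indexes into the one and only fibs, so each Fibonacci number is computed at most once.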
Your first version defines a monomorphic constant, while the second defines a polymorphic function. A polymorphic function can't use one and the same internal list for all the different types it might have to serve, so there is no sharing, i.e. no memoization.
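Concretely (a sketch with made-up names fibV1/fibV2, assuming GHC's usual defaulting rules), these are the types the two versions end up with:

fibV1 :: Int -> Integer          -- version 1: a constant binding; without a signature the
fibV1 = (map fib' [0..] !!)      -- monomorphism restriction plus defaulting produce this
  where fib' 1 = 1               -- same concrete type, so one list can serve every call
        fib' 2 = 1
        fib' n = fibV1 (n-2) + fibV1 (n-1)

fibV2 :: Num a => Int -> a       -- version 2: polymorphic in the result; no single list
fibV2 n = map fib' [0..] !! n    -- value could serve every choice of a
  where fib' 1 = 1
        fib' 2 = 1
        fib' n = fibV2 (n-2) + fibV2 (n-1)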
With the first version the compiler is being generous with us, lifting out that constant subexpression (map fib' [0..]) and making it a separate shareable entity, but it is under no obligation to do so, and there are actually cases where we wouldn't want it to do that for us automatically.
(edit:) Consider these re-writes:
fib1 = f                       fib2 n = f n                   fib3 n = f n
  where                          where                          where
  f i = xs !! i                  f i = xs !! i                  f i = xs !! i
  xs = map fib' [0..]            xs = map fib' [0..]            xs = map fib' [0..]
  fib' 1 = 1                     fib' 1 = 1                     fib' 1 = 1
  fib' 2 = 1                     fib' 2 = 1                     fib' 2 = 1
  fib' i=fib1(i-2)+fib1(i-1)     fib' i=fib2(i-2)+fib2(i-1)     fib' i=f(i-2)+f(i-1)
So the real story seems to be about the nested scope definitions. There is no outer scope with the 1st definition, and the 3rd is careful not to call the outer-scope fib3, but the same-level f.
Each new invocation of fib2 seems to create its nested definitions anew, because any of them could (in theory) be defined differently depending on the value of n (thanks to Vitus and Tikhon for pointing that out). With the first definition there is no n to depend on, and with the third there is a dependency, but each separate call to fib3 calls into f, which is careful to call only definitions from the same-level scope, internal to this specific invocation of fib3, so the same xs gets reused (i.e. shared) within that invocation of fib3.
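To make that "could depend on n" point concrete, here is a hypothetical variant (fib2' is a made-up name) whose where-block really does depend on the argument, so it genuinely must be rebuilt on every call:

fib2' :: Int -> Integer
fib2' n = f n
  where f i    = xs !! i
        xs     = map fib' [0..n]   -- now actually depends on n
        fib' 1 = 1
        fib' 2 = 1
        fib' i = fib2' (i-2) + fib2' (i-1)

The compiler can't lift such an xs out of the function, and unless it analyses the definitions it treats fib2's where-block the same way.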
But nothing precludes the compiler from recognizing that the internal definitions in fib1, fib2 and fib3 are in fact independent of the outer n binding, and performing the lambda lifting after all, resulting in full memoization (except for the polymorphic definitions). In fact that is exactly what happens with all three versions when they are declared with monomorphic types and compiled with the -O2 flag. With polymorphic type declarations, fib3 exhibits local sharing and fib2 no sharing at all.
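One way to check which behaviour you are actually getting is to instrument the worker with Debug.Trace (a sketch; fibT is a made-up name). If the trace messages show up only on the first call, the list is shared across calls; if they show up again on every call, it is not:

import Debug.Trace (trace)

fibT :: Int -> Integer
fibT = (map fib' [0..] !!)
  where fib' 1 = 1
        fib' 2 = 1
        fib' n = trace ("computing slot " ++ show n) $
                   fibT (n-2) + fibT (n-1)

The same trace line dropped into fib2 and fib3 makes the local (per-call) sharing of fib3 versus the no-sharing of fib2 just as visible.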
Ultimately, depending on the compiler, the optimizations used, how you test it (loading files in GHCi, compiled or not, with -O2 or not, or standalone), and whether it gets a monomorphic or a polymorphic type, the behaviour might change completely: it may exhibit local (per-call) sharing (i.e. linear time on each call), memoization (i.e. linear time on the first call, and near-zero time on subsequent calls with the same or a smaller argument), or no sharing at all (exponential time).
Short answer is, it's a compiler thing. :)