When to use memoization in Ruby on Rails

Gdeglin · Mar 30, 2009

In mid-July 2008, memoization was added to Rails core. A demonstration of the usage is here.

I have not been able to find any good examples of when methods should be memoized, or of the performance implications of doing so. This blog post, for example, suggests that memoization often should not be used at all.

For something that could potentially have tremendous performance implications, there seem to be few resources that go beyond providing a simple tutorial.

Has anyone seen memoization used in their own projects? What factors would make you consider memoizing a method?


After doing some more research on my own I found that memoization is used a remarkable number of times inside of Rails core.

Here's an example: http://github.com/rails/rails/blob/1182658e767d2db4a46faed35f0b1075c5dd9a88/actionpack/lib/action_view/template.rb.

This usage seems to go against the findings of the blog post above that found memoization can hurt performance.

Answer

Theo · Dec 4, 2009

I think many Rails developers don't fully understand what memoization does and how it works. I've seen it applied to methods that return lazy-loaded collections (like a Sequel dataset), and to methods that take no arguments but calculate something based on instance variables. In the first case the memoization is nothing but overhead, and in the second it's a source of nasty, hard-to-track-down bugs.
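To make the second case concrete, here is a minimal sketch in plain Ruby. The Rectangle class is hypothetical, and the `@area ||=` idiom stands in for the Rails `memoize` macro so the example runs without Rails. It shows how a cached result can go stale once the instance variables it was computed from change.

```ruby
# Hypothetical class for illustration only; `@area ||=` is hand-rolled
# memoization, assumed here as a stand-in for Rails' `memoize` macro.
class Rectangle
  attr_accessor :width, :height

  def initialize(width, height)
    @width = width
    @height = height
  end

  # Takes no arguments, but the result depends on @width and @height.
  def area
    @area ||= width * height
  end
end

rect = Rectangle.new(2, 3)
first = rect.area    # computed as 2 * 3 and cached
rect.width = 10
second = rect.area   # still the cached 6; 10 * 3 is never computed
```

Nothing raises an error here, which is what makes this kind of bug hard to trace: the method simply keeps returning a value that no longer matches the object's state.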

I would not apply memoization if

  • the returned value is only slightly expensive to calculate. It would have to be very expensive, and not further optimizable, for it to be worth memoizing.
  • the returned value is or could be lazy-loaded
  • the method is not a pure function. A pure function is guaranteed to return exactly the same value for the same arguments, and uses only its arguments, or other pure functions, to do its work. Using instance variables, or calling methods that in turn use instance variables, means that the method could return different results for the same arguments.
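The lazy-loading point can be sketched with Ruby's built-in Enumerator::Lazy instead of a Sequel dataset (an assumption made so the example runs without a database). The Report class and its counter are hypothetical. Memoizing the method caches only the lazy object, not the work it performs when it is eventually evaluated.

```ruby
# Hypothetical class for illustration; Enumerator::Lazy stands in for a
# lazy-loaded collection such as a dataset.
class Report
  attr_reader :loads

  def initialize
    @loads = 0
  end

  # Memoizes the lazy enumerator itself, not the values it yields.
  def rows
    @rows ||= (1..3).lazy.map do |n|
      @loads += 1   # counts how often the "expensive" step really runs
      n * 10
    end
  end
end

report = Report.new
first_pass = report.rows.to_a    # forces evaluation: 3 loads
second_pass = report.rows.to_a   # forces evaluation again: 3 more loads
total_loads = report.loads       # 6, so the memoization saved nothing
```

Building the lazy object is already cheap, so caching it adds bookkeeping without avoiding any of the expensive work.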

There are other situations too where memoization isn't appropriate, such as the one in the question and the answers above, but these are three that I think aren't as obvious.

The last item is probably the most important. Memoization caches a result based on the arguments to the method, so a method that looks like either of these cannot be memoized:

def unmemoizable1(name)
  # The format arguments must be wrapped in an array
  "%s was here %s" % [name, Time.now.strftime('%Y-%m-%d')]
end

def unmemoizable2
  find_by_shoe_size(@size)
end

Both can, however, be rewritten to take advantage of memoization (although in these two cases it should obviously not be done for other reasons):

def unmemoizable1(name)
  memoizable1(name, Time.now.strftime('%Y-%m-%d'))
end

def memoizable1(name, time)
  "#{name} was here #{time}"
end
memoize :memoizable1

def unmemoizable2
  memoizable2(@size)
end

def memoizable2(size)
  find_by_shoe_size(size)
end
memoize :memoizable2

(assuming that find_by_shoe_size doesn't have, or rely on, any side effects)

The trick is to extract a pure function from the method and apply memoization to that instead.
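The same extraction can be sketched as a self-contained script in plain Ruby. The Greeter class, the method names, and the Hash-based cache are all hypothetical, chosen so the example runs without Rails. They are not how the `memoize` macro is implemented.

```ruby
# Hypothetical class for illustration; the Hash cache is an assumed
# stand-in for Rails' `memoize` macro.
class Greeter
  attr_reader :calls

  def initialize
    @greetings = {}   # cache keyed by the full argument list
    @calls = 0        # counts how often the string is actually built
  end

  # Impure wrapper: reads the clock, then delegates to the pure method.
  def greeting(name)
    memoized_greeting(name, Time.now.strftime('%Y-%m-%d'))
  end

  # Pure: the result depends only on name and date, so it is safe to cache.
  def memoized_greeting(name, date)
    @greetings.fetch([name, date]) do
      @calls += 1
      @greetings[[name, date]] = "#{name} was here #{date}"
    end
  end
end

greeter = Greeter.new
a = greeter.memoized_greeting("Theo", "2009-12-04")  # computed
b = greeter.memoized_greeting("Theo", "2009-12-04")  # served from the cache
c = greeter.memoized_greeting("Theo", "2009-12-05")  # new arguments, computed
```

Because the cache key is the complete argument list, a change of date produces a new entry rather than a stale result, and the impure part (reading the clock) stays outside the cached method.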