How does Lru_cache (from functools) Work?

softwareflow picture softwareflow · Apr 17, 2018 · Viewed 45.1k times · Source

Especially when using recursive code there are massive improvements with lru_cache. I do understand that a cache is a space that stores data that has to be served fast and saves the computer from recomputing.

How does the Python lru_cache from functools work internally?

I'm Looking for a specific answer, does it use dictionaries like the rest of Python? Does it only store the return value?

I know that Python is heavily built on top of dictionaries, however, I couldn't find a specific answer to this question. Hopefully, someone can simplify this answer for all the users on StackOverflow.

Answer

ndpu picture ndpu · Apr 17, 2018

The functools source code is available here: https://github.com/python/cpython/blob/master/Lib/functools.py

lru_cache uses the _lru_cache_wrapper decorator (python decorator with arguments pattern) which has a cache dictionary in context in which it saves the return value of the function called (every decorated function will have its own cache dict). The dictionary key is generated with the _make_key function from the arguments. Added some bold comments below:

# ACCORDING TO PASSED maxsize ARGUMENT _lru_cache_wrapper
# DEFINES AND RETURNS ONE OF wrapper DECORATORS

def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
    # Constants shared by all lru cache instances:
    sentinel = object()      # unique object used to signal cache misses

    cache = {}                                # RESULTS SAVES HERE
    cache_get = cache.get    # bound method to lookup a key or return None

    # ... maxsize is None:

    def wrapper(*args, **kwds):
        # Simple caching without ordering or size limit
        nonlocal hits, misses
        key = make_key(args, kwds, typed)     # BUILD A KEY FROM ARGUMENTS
        result = cache_get(key, sentinel)     # TRYING TO GET PREVIOUS CALLS RESULT
        if result is not sentinel:            # ALREADY CALLED WITH PASSED ARGS
            hits += 1
            return result                     # RETURN SAVED RESULT
                                              # WITHOUT ACTUALLY CALLING FUNCTION
        misses += 1
        result = user_function(*args, **kwds) # FUNCTION CALL - if cache[key] empty
        cache[key] = result                   # SAVE RESULT

        return result
    # ...

    return wrapper