Especially when using recursive code there are massive improvements with lru_cache
. I do understand that a cache is a space that stores data that has to be served fast and saves the computer from recomputing.
How does the Python lru_cache
from functools work internally?
I'm Looking for a specific answer, does it use dictionaries like the rest of Python? Does it only store the return
value?
I know that Python is heavily built on top of dictionaries, however, I couldn't find a specific answer to this question. Hopefully, someone can simplify this answer for all the users on StackOverflow.
The functools
source code is available here: https://github.com/python/cpython/blob/master/Lib/functools.py
lru_cache
uses the _lru_cache_wrapper
decorator (python decorator with arguments pattern) which has a cache
dictionary in context in which it saves the return value of the function called (every decorated function will have its own cache dict). The dictionary key is generated with the _make_key
function from the arguments. Added some bold comments below:
# ACCORDING TO PASSED maxsize ARGUMENT _lru_cache_wrapper
# DEFINES AND RETURNS ONE OF wrapper DECORATORS
def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
# Constants shared by all lru cache instances:
sentinel = object() # unique object used to signal cache misses
cache = {} # RESULTS SAVES HERE
cache_get = cache.get # bound method to lookup a key or return None
# ... maxsize is None:
def wrapper(*args, **kwds):
# Simple caching without ordering or size limit
nonlocal hits, misses
key = make_key(args, kwds, typed) # BUILD A KEY FROM ARGUMENTS
result = cache_get(key, sentinel) # TRYING TO GET PREVIOUS CALLS RESULT
if result is not sentinel: # ALREADY CALLED WITH PASSED ARGS
hits += 1
return result # RETURN SAVED RESULT
# WITHOUT ACTUALLY CALLING FUNCTION
misses += 1
result = user_function(*args, **kwds) # FUNCTION CALL - if cache[key] empty
cache[key] = result # SAVE RESULT
return result
# ...
return wrapper