The Context
I have located a rather critical bug in Django Cache Machine that causes its invalidation logic to lose its mind after upgrading from Django 1.4 to 1.7.
The bug is localized to invocations of only() on models that extend Cache Machine's CachingMixin. It results in deep recursions that occasionally bust the stack, and otherwise create huge flush_lists, which Cache Machine uses for bi-directional invalidation of models in ForeignKey relationships.
    class MyModel(CachingMixin, models.Model):
        id = models.CharField(max_length=50, blank=True)
        nickname = models.CharField(max_length=50, blank=True)
        favorite_color = models.CharField(max_length=50, blank=True)
        content_owner = models.ForeignKey(OtherModel)

    m = MyModel.objects.only('id').all()
The Bug
The bug occurs in the following lines (https://github.com/jbalogh/django-cache-machine/blob/f827f05b195ad3fc1b0111131669471d843d631f/caching/base.py#L253-L254). In this case self is an instance of MyModel with a mix of deferred and undeferred attributes:
    fks = dict((f, getattr(self, f.attname)) for f in self._meta.fields
               if isinstance(f, models.ForeignKey))
Cache Machine does bidirectional invalidation across ForeignKey relationships. It does this by looping over all the fields in a Model and storing a series of pointers in cache that point to the objects that need to be invalidated when the object in question is invalidated.
The use of only() in the Django ORM does some metaprogramming magic that overrides the unfetched attributes with Django's DeferredAttribute implementation. Under normal circumstances an access to favorite_color would invoke DeferredAttribute.__get__ (https://github.com/django/django/blob/18f3e79b13947de0bda7c985916d5a04e28936dc/django/db/models/query_utils.py#L121-L146) and fetch the attribute either from the result cache or the data source. It does this by fetching the undeferred representation of the Model in question and calling another only() query on it.
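To make the descriptor behaviour concrete, here is a minimal pure-Python sketch of a lazily loaded attribute. It is not Django's implementation: LazyAttribute, fake_fetch and Row are hypothetical names invented for illustration, and the "fetch" is just a function that records that it was called.

```python
class LazyAttribute:
    """Hypothetical stand-in for a deferred-field descriptor (not Django's class)."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader  # callable that simulates a database fetch

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: hand back the descriptor itself.
            return self
        # This is a non-data descriptor (no __set__), so once the value is
        # stored in the instance __dict__, normal lookup finds it there first.
        if self.name not in instance.__dict__:
            instance.__dict__[self.name] = self.loader(self.name)
        return instance.__dict__[self.name]


fetches = []  # records every simulated fetch


def fake_fetch(name):
    fetches.append(name)
    return "value of " + name


class Row:
    favorite_color = LazyAttribute("favorite_color", fake_fetch)


row = Row()
assert fetches == []         # nothing has been loaded yet
first = row.favorite_color   # attribute access triggers the simulated fetch
second = row.favorite_color  # second access is served from the instance __dict__
```

The point of the sketch is that a plain getattr on the instance is enough to start a fetch, while looking the name up in the class __dict__ returns the descriptor object without running it.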
This is the problem: when looping over the foreign keys in the Model and accessing their values, Cache Machine introduces an unintentional recursion. Calling getattr(self, f.attname) on a deferred attribute induces a fetch of a Model that has the CachingMixin applied and has deferred attributes of its own. This starts the whole caching process over again.
The Question
I would like to open a PR to fix this and I believe the answer to this is as simple as skipping over the deferred attributes, but I'm not sure how to do it because accessing the attribute causes the fetch process to start.
If all I have is a handle on an instance of a Model with a mix of deferred and undeferred attributes, is there a way to determine if an attribute is a DeferredAttribute without accessing it?
    fks = dict((f, getattr(self, f.attname)) for f in self._meta.fields
               if (isinstance(f, models.ForeignKey) and <f's value isn't a DeferredAttribute>))
Here is how to check if a field is deferred:
    from django.db.models.query_utils import DeferredAttribute

    is_deferred = isinstance(model_instance.__class__.__dict__.get(field.attname), DeferredAttribute)
Taken from: https://github.com/django/django/blob/1.9.4/django/db/models/base.py#L393
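Plugged into the comprehension from the question, the check might look roughly like the sketch below. It runs without Django, so DeferredAttribute and FakeField here are stand-ins I made up for illustration, and is_deferred and loaded_fk_values are hypothetical helper names rather than anything in Cache Machine. The stand-in descriptor raises if it is ever accessed, which shows that the filter never touches a deferred value.

```python
class DeferredAttribute:
    """Stand-in for django.db.models.query_utils.DeferredAttribute."""

    def __init__(self, attname):
        self.attname = attname

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # A real deferred attribute would start a fetch here.
        raise AssertionError("deferred attribute %r was accessed" % self.attname)


class FakeField:
    """Hypothetical stand-in for a ForeignKey field; only attname is modelled."""

    def __init__(self, attname):
        self.attname = attname


def is_deferred(instance, field):
    # Look in the class __dict__ so the descriptor's __get__ never runs.
    return isinstance(type(instance).__dict__.get(field.attname), DeferredAttribute)


def loaded_fk_values(instance, fk_fields):
    # Same shape as the fks dict in the question, minus the deferred fields.
    return dict((f, getattr(instance, f.attname))
                for f in fk_fields if not is_deferred(instance, f))


class Base(object):
    pass


class DeferredModel(Base):
    # Mimics the dynamic subclass carrying descriptors for the deferred fields.
    content_owner_id = DeferredAttribute("content_owner_id")


owner = FakeField("content_owner_id")
editor = FakeField("editor_id")

obj = DeferredModel()
obj.editor_id = 7  # a loaded (non-deferred) foreign key value

result = loaded_fk_values(obj, [owner, editor])
```

Note that the check only inspects the instance's immediate class, which is where the deferred descriptors live in the Django version the question is about. Newer Django releases also expose Model.get_deferred_fields(), which may be a simpler route if you are not pinned to 1.7.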