Pickle: dealing with updated class definitions

iestyn · Apr 29, 2013

After a class definition is updated by recompiling a script, pickle refuses to serialize previously instantiated objects of that class, giving the error: "Can't pickle object: it's not the same object as "

Is there a way to tell pickle to ignore such cases, so that it identifies classes by name alone and disregards whichever internal unique ID is causing the mismatch?

I would definitely welcome as an answer the suggestion of an alternative, equivalent module which solves this problem in a convenient and robust manner.


For reference, here's my motivation:

I am creating a high-productivity, rapid-iteration development environment in which Python scripts are edited live. Scripts are repeatedly recompiled, but data persists across compiles. As part of the productivity goals, I am trying to use pickle for serialization, to avoid the cost of writing and updating explicit serialization code for constantly changing data structures.

Mostly I serialize built-in types. I am careful to avoid meaningful changes in the classes which I pickle, and when necessary I use the copy_reg.pickle mechanism to perform upconversion on unpickle.
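For illustration, a minimal sketch of that kind of upconversion, assuming Python 3 (where copy_reg is named copyreg). The Settings class, the version numbers, and the helpers rebuild_settings and reduce_settings are hypothetical names made up for this example:

```python
import copyreg
import pickle


class Settings:
    def __init__(self, width, height, depth=1):
        self.width = width
        self.height = height
        self.depth = depth  # field added in (hypothetical) format version 2


CURRENT_VERSION = 2  # hypothetical version number stored with each pickle


def rebuild_settings(version, state):
    """Reconstruct a Settings instance, upgrading state saved by older versions."""
    if version < 2:
        state.setdefault("depth", 1)  # version 1 payloads had no depth field
    obj = Settings.__new__(Settings)
    obj.__dict__.update(state)
    return obj


def reduce_settings(obj):
    """Reducer for copyreg: store a version number next to the instance state."""
    return rebuild_settings, (CURRENT_VERSION, dict(obj.__dict__))


# Tell pickle to use reduce_settings whenever it meets a Settings instance.
copyreg.pickle(Settings, reduce_settings)

# Round trip with the current format.
restored = pickle.loads(pickle.dumps(Settings(640, 480, depth=3)))

# What the reconstructor does with state written by the older format.
upgraded = rebuild_settings(1, {"width": 640, "height": 480})
```

The pickle stores a reference to rebuild_settings by name rather than to the class itself, so the upgrade logic lives in one function that can change along with the class.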

Script recompilation prevents me from pickling objects at all, even if class definitions have not actually changed (or have only changed in a benign way).
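A minimal sketch that reproduces the problem, assuming Python 3. The live_script module name, the Point class, and the recompile helper are hypothetical stand-ins for the live-editing environment:

```python
import pickle
import sys
import types

SOURCE = """
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
"""

# Hypothetical stand-in for a script module that is edited live.
module = types.ModuleType("live_script")
sys.modules["live_script"] = module


def recompile(source):
    """Re-execute the script source in the module namespace, rebinding its classes."""
    exec(compile(source, "live_script.py", "exec"), module.__dict__)


recompile(SOURCE)
p = module.Point(1, 2)

# Before recompilation the instance pickles normally.
first_ok = pickle.loads(pickle.dumps(p)).x == 1

# Recompile the *identical* source: live_script.Point is now a new class object,
# while p still refers to the old one.
recompile(SOURCE)

try:
    pickle.dumps(p)
    error = None
except pickle.PicklingError as exc:
    error = str(exc)
```

Even though the source text has not changed, the second pickle.dumps call fails, because pickle looks the class up by module and name and finds an object that is not type(p).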

Answer

Mike McKerns picture Mike McKerns · Oct 14, 2013

Unless you can unpack the earlier version of the class definition, the reference that pickle needs in order to dump and load the instance is now gone. So this is "not possible".

However, if you did want to do it, you could save previous versions of your class definitions. You would then have to trick pickle into referring to your old, saved class definitions rather than the most current ones, which might just amount to editing obj.__class__ or obj.__module__ to point to your old class. There may also be other odd things in your class instance that refer to the old class definition, and you would have to handle those too. Also, if you add or delete a class method, you may run into some unexpected results, or have to update the instance accordingly. Another interesting twist is that you could make the unpickler always use the most current version of your class.
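As a rough sketch of the obj.__class__ idea, here is the reverse direction from saving old definitions: a stale instance is repointed at the current class before it is dumped. This assumes Python 3 and plain classes with compatible layouts (no differing __slots__). The live_script module and the helpers recompile and rebind_to_current_class are hypothetical names, not part of pickle or dill:

```python
import pickle
import sys
import types

SOURCE = """
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
"""

# Hypothetical stand-in for a script module that is edited live.
module = types.ModuleType("live_script")
sys.modules["live_script"] = module


def recompile(source):
    """Re-execute the script source in the module namespace, rebinding its classes."""
    exec(compile(source, "live_script.py", "exec"), module.__dict__)


def rebind_to_current_class(obj):
    """Point obj.__class__ at the class currently bound to the same module and name."""
    old = type(obj)
    current = sys.modules[old.__module__]
    for part in old.__qualname__.split("."):
        current = getattr(current, part)
    if current is not old:
        obj.__class__ = current  # only valid for compatible class layouts
    return obj


recompile(SOURCE)
p = module.Point(1, 2)
recompile(SOURCE)  # p now refers to a stale class object

rebind_to_current_class(p)
restored = pickle.loads(pickle.dumps(p))
```

This only fixes the one object passed in. Instances nested inside containers or attributes would each need the same treatment, and any instance state that no longer matches the new class definition still has to be updated by hand.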

My serialization package, dill, has some methods that can dump compiled source from a live code object to a temporary file, and then serialize using that temporary file. It's one of the newer parts of the package, so it's not as robust as the rest of dill. Also, your use case is not a use case I'd considered, but I could see how it would be a nice feature to have.