Load pickled object in different file - Attribute error

lmartens · Oct 27, 2016 · Viewed 17.8k times

I am having trouble loading a pickled file in a module other than the one where I pickled it. I am aware of the following thread: Unable to load files using pickle and multiple modules. I've tried the proposed solution of importing the class into the module where I unpickle the file, but it keeps giving me the same error: AttributeError: Can't get attribute 'Document' on <module '__main__' from ''>

The basic structure of what I am trying to do:

Util file that pickles and unpickles objects, utils.py:

import pickle

def save_document(doc):
    from class_def import Document

    # file_path is not defined in this snippet; it is assumed to be set elsewhere
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)

def load_document(file_path):
    from class_def import Document

    with open(file_path, 'rb') as doc_file:
        return pickle.load(doc_file)

File where Document object is defined and the save util method is called, class_def.py:

import utils

class Document(object):
    data = ""

if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)

File where the load util method is called, process.py:

import utils

if __name__ == '__main__':
    utils.load_document(file_path)

Running process.py gives the mentioned AttributeError. If I import the class_def.py file into process.py and run its main method as mentioned in the original thread it works, but I want to be able to run these two modules separately, since the class_def file is a preprocessing step that takes quite some time. How could I solve this?

Answer

Tadhg McDonald-Jensen · Oct 27, 2016

In your class_def.py file you have this code:

if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)

This means that doc will be a __main__.Document object, so when it is unpickled, pickle expects to find a Document class in the __main__ module. When process.py is the script being run, __main__ is process.py, which has no Document. To fix this, use the definition of Document from the module named class_def, which means adding an import here:

(In general you can just do from <own module name> import * right inside the if __name__ == "__main__" block.)

if __name__ == '__main__':
    from class_def import Document 
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
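
Why the module name matters can be checked directly: pickle stores a reference to the class (module name plus class name), not the class itself. The following is only a minimal sketch of that behaviour. The module name class_def_demo is a hypothetical stand-in built in memory so the example runs on its own; it is not part of the question's code.

```python
import pickle
import sys
import types

# Hypothetical stand-in for class_def.py, built in memory so the sketch is self-contained.
demo_module = types.ModuleType("class_def_demo")
exec("class Document(object):\n    data = ''\n", demo_module.__dict__)
sys.modules["class_def_demo"] = demo_module

doc = demo_module.Document()
payload = pickle.dumps(doc)

# The payload names the module and the class; it does not contain the class body.
recorded_module = demo_module.Document.__module__
names_in_payload = b"class_def_demo" in payload and b"Document" in payload

# If the named module has no Document attribute at load time, unpickling fails,
# just as it does when process.py is __main__ and has no Document.
del demo_module.Document
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc
```

Importing Document from class_def inside the if __name__ == '__main__' block makes the recorded module name class_def instead of __main__.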

With the import inside the if __name__ == '__main__' block, the class_def.py file is executed twice, once as __main__ and once as class_def, but the data is pickled as a class_def.Document object, so loading it retrieves the class from the correct place. Otherwise, if you have a way of constructing one Document object from another, you can do something like this in utils.py:

def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document  # get the class from the importable module
        doc = Document(doc)  # convert it to the class we are able to use

    # file_path is not defined in this snippet; it is assumed to be set elsewhere
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)

Although usually I'd prefer the first way.
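
The second approach assumes that Document can be built from another document, which the class in the question does not support as written. A minimal sketch of such a constructor, assuming that copying instance attributes is enough; the other parameter and the MainDocument stand-in are hypothetical names used only for illustration:

```python
class Document(object):
    data = ""

    def __init__(self, other=None):
        # Hypothetical copy constructor: take over the instance attributes
        # of another document-like object, if one is given.
        if other is not None:
            self.__dict__.update(vars(other))


class MainDocument(object):
    """Stand-in for the same class when it was created as __main__.Document."""
    data = ""


original = MainDocument()
original.data = "some text"

# Rebuild the object as an instance of the importable class before pickling.
converted = Document(original)
```

This only copies the attribute dictionary, so a class with slots or nested objects that must also be converted would need more than this.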