I'm having trouble loading a pickled file in a module other than the one where I pickled it. I am aware of the following thread: Unable to load files using pickle and multipile modules. I've tried the proposed solution of importing the class into the module where I unpickle the file, but I keep getting the same error:
AttributeError: Can't get attribute 'Document' on <module '__main__' from ''>
The basic structure of what I am trying to do:
Util file that pickles and unpickles objects, utils.py:
import pickle

def save_document(doc):
    from class_def import Document
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)

def load_document(file_path):
    from class_def import Document
    with open(file_path, 'rb') as doc_file:
        return pickle.load(doc_file)
File where Document object is defined and the save util method is called, class_def.py:
import utils

class Document(object):
    data = ""

if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
File where the load util method is called, process.py:
import utils

if __name__ == '__main__':
    utils.load_document(file_path)
Running process.py gives the AttributeError mentioned above. If I import class_def.py into process.py and run its main method, as suggested in the original thread, it works, but I want to be able to run these two modules separately, since class_def.py is a preprocessing step that takes quite some time. How can I solve this?
In your class_def.py file you have this code:
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
This means that doc will be a __main__.Document object, so the pickle records that the class lives in the __main__ module, and unpickling expects to find a Document class there. When you run process.py, __main__ is process.py, which has no Document, hence the error. To fix this you need to use the definition of Document from a module called class_def, meaning you would add an import here (in general you can just do from <own module name> import * right inside the if __name__ == "__main__" block):
if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
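To see why the module name matters, here is a minimal sketch that reproduces the lookup in a single process. The module name demo_writer is a hypothetical stand-in for whichever module defined the class at pickling time, and the class is attached to it by hand purely for illustration.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module that defines the class when pickling.
writer = types.ModuleType("demo_writer")
sys.modules["demo_writer"] = writer


class Document(object):
    data = ""


# Pretend Document was defined inside demo_writer. __qualname__ is set
# explicitly so the sketch also works when pasted inside a function.
Document.__module__ = "demo_writer"
Document.__qualname__ = "Document"
writer.Document = Document

payload = pickle.dumps(Document())

# The pickle stores the module and class names, not the class body.
assert b"demo_writer" in payload and b"Document" in payload

# Simulate loading where that module has no Document attribute,
# like __main__ when process.py is the script being run.
del writer.Document
error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = exc  # same kind of error as in the question

# Once the class is reachable under the recorded name, loading works.
writer.Document = Document
restored = pickle.loads(payload)
```

The same reasoning applies to the fix above: once the object is pickled as class_def.Document, any script that can import class_def can load it.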
With this import, class_def.py has to be run twice, once as __main__ and once as class_def, but the data will be pickled as a class_def.Document object, so loading it will retrieve the class from the correct place. Otherwise, if you have a way of constructing one document object from another, you can do something like this in utils.py:
def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document  # get the class from the reference-able module
        doc = Document(doc)  # convert it to the class we are able to use
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)
Although usually I'd prefer the first way.
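Note that the second approach assumes Document can be built from another document, which the class in the question does not support as written. A rough sketch of what such a constructor might look like follows. The optional other parameter and the MainDocument stand-in are assumptions made for illustration and are not part of the original code.

```python
class Document(object):
    """Hypothetical version of Document with a copy-style constructor."""

    def __init__(self, other=None):
        # Copy the payload from another document-like object if one is
        # given. Duck typing is used instead of isinstance(), because
        # __main__.Document and class_def.Document are distinct classes.
        self.data = "" if other is None else other.data


class MainDocument(object):
    """Stand-in for the copy of the class that lives in __main__."""

    data = "some text"


source = MainDocument()
converted = Document(source)  # now an instance of the importable class
empty = Document()            # the no-argument form keeps working
```

An isinstance(other, Document) check in the constructor would reject the __main__ copy, which is why the sketch only relies on the data attribute.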