Is there a way to make Python ignore any .pyc files that are present and always interpret all the code (including imported modules) directly? Google hasn't turned up any answers, so I suspect not, but it seemed worth asking just in case.
(Why do I want to do this? I have a large pipeline of Python scripts which are run repeatedly over a cluster of a couple hundred computers. The Python scripts themselves live on a shared NFS filesystem. Somehow, rarely, after having been run hundreds of times over several hours, they will suddenly start crashing with an error about not being able to import a module. Forcing the regeneration of the .pyc file fixes the problem. I want, of course, to fix the underlying causes, but in the meantime we also need the system to continue running, so it seems like ignoring the .pyc files if possible would be a reasonable workaround).
P.S. I'm using Python 2.5, so I can't use -B.
You could use the standard Python library's imp module to reimplement __builtins__.__import__, which is the hook function called by the import and from statements. In particular, the imp.load_module function can be used to load a .py even when the corresponding .pyc is present. Be sure to study carefully all the docs in the page I've pointed to, plus those for import, as it's kind of a delicate job. The docs themselves suggest using import hooks instead (per PEP 302), but for this particular task I suspect that would be even harder.
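To give a feel for the building block such a hook would need, here is a minimal sketch of loading one module straight from its source file. Note that it deliberately swaps in plain compile() and exec rather than imp.load_module, so that no bytecode file is ever read or written; the helper name load_from_source is hypothetical, and it handles a single module only (imports performed inside that module still go through the normal machinery unless __import__ itself is replaced).

```python
import sys
import types


def load_from_source(name, path):
    """Hypothetical helper: build module `name` from the .py file at `path`
    without reading or writing any .pyc file."""
    f = open(path, "r")
    try:
        source = f.read()
    finally:
        f.close()

    # Older interpreters insist on newline-terminated source.
    code = compile(source + "\n", path, "exec")

    module = types.ModuleType(name)
    module.__file__ = path

    # Register before executing so circular imports can find the module.
    sys.modules[name] = module
    try:
        exec(code, module.__dict__)
    except Exception:
        # Do not leave a half-initialised module behind.
        del sys.modules[name]
        raise
    return module
```

A replacement __import__ would still have to locate the file, deal with packages and relative imports, and fall back to the original hook for built-in and extension modules, which is where most of the delicacy lies.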
BTW, likely causes for your observed problems include race conditions between different computers trying to write .pyc files at the same time -- NFS locking is notoriously flaky and has always been ;-). As long as every Python compiler you're using is at the same version (if not, you're in big trouble anyway ;-), I'd rather precompile all of those .py files into .pyc and make their directories read-only; the latter seems the simplest approach anyway (rather than hacking __import__), even if for some reason you can't precompile.
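The precompile-and-lock approach could be scripted roughly as follows. This is only a sketch under a few assumptions: it is run once, from a single machine, by a user who owns the tree; the function name precompile_and_lock is made up for illustration; and permissions are POSIX-style. It relies on the standard compileall module and on os.chmod.

```python
import compileall
import os
import stat


def precompile_and_lock(root):
    """Compile every .py under `root`, then remove write permission from
    every directory in the tree (hypothetical helper)."""
    # compile_dir returns a true value when everything compiled cleanly.
    ok = compileall.compile_dir(root, quiet=1)

    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    # Walk bottom-up so subdirectories are locked before their parents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode & ~write_bits)

    return bool(ok)
```

Once the directories are read-only, an interpreter that finds a missing or stale .pyc simply fails to write a new one and carries on with the freshly compiled code in memory, so no machine can clobber a file another machine is reading. Remember to restore write permission before deploying updated sources.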