How to read a (static) file from inside a Python package?

ronszon · May 17, 2011

Could you tell me how I can read a file that is inside my Python package?

My situation

A package that I load has a number of templates (text files used as strings) that I want to load from within the program. But how do I specify the path to such a file?

Imagine I want to read a file from:

package\templates\temp_file

Some kind of path manipulation? Package base path tracking?

Answer

ankostis · Jan 2, 2014

TL;DR: Use the standard library's importlib.resources module, as explained in method 2 below.

The traditional pkg_resources from setuptools is no longer recommended, because the new method:

  • is significantly more performant;
  • is safer, since using packages (instead of path strings) raises errors early, at import time;
  • is more intuitive, because you don't have to "join" paths;
  • is faster when developing, since you don't need an extra dependency (setuptools) but rely on Python's standard library alone.

I kept the traditional method listed first, to explain how it differs from the new method when porting existing code (porting is also explained here).



Let's assume your templates are located in a folder nested inside your module's package:

  <your-package>
    +--<module-asking-the-file>
    +--templates/
          +--temp_file                         <-- We want this file.

Note 1: For sure, we should NOT fiddle with the __file__ attribute (e.g. code will break when served from a zip).
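To make Note 1 concrete, here is a minimal sketch that builds a throwaway zip archive, imports a package from it, and compares a __file__-based path with importlib.resources. The archive and package names (file_demo.zip, file_demo_pkg) are made up for illustration.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a zip that contains a (hypothetical) package with one template file.
    archive = os.path.join(tmp, "file_demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("file_demo_pkg/__init__.py", "")
        zf.writestr("file_demo_pkg/templates/__init__.py", "")
        zf.writestr("file_demo_pkg/templates/temp_file", "Hello from a zip")

    sys.path.insert(0, archive)  # serve the package straight from the zip
    try:
        importlib.invalidate_caches()
        templates = importlib.import_module("file_demo_pkg.templates")

        # The __file__ approach builds a path that points *into* the archive,
        # which is not a real file on disk.
        naive_path = os.path.join(os.path.dirname(templates.__file__), "temp_file")
        naive_path_exists = os.path.exists(naive_path)

        # importlib.resources asks the package's loader for the data instead.
        template = importlib.resources.read_text(templates, "temp_file")
    finally:
        sys.path.remove(archive)

print(naive_path_exists)  # False
print(template)           # Hello from a zip
```

The __file__-based path does not exist on disk, so open() on it would fail, while the loader-based read still returns the file's contents.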

Note 2: If you are building this package, remember to declare your data files as package_data or data_files in your setup.py.
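As a rough illustration of Note 2, the package_data declaration could look like the sketch below. The project name, version and package name (my_package) are placeholders, not taken from the question; the arguments are kept in a plain dict so the snippet can be run without triggering a build.

```python
# Hypothetical layout: my_package/templates/temp_file
setup_kwargs = {
    "name": "my_package",      # placeholder project name
    "version": "0.1.0",        # placeholder version
    "packages": ["my_package"],
    # Keys are package names; values are glob patterns relative to that package.
    "package_data": {"my_package": ["templates/*"]},
}

# In a real setup.py you would finish with:
#     from setuptools import setup
#     setup(**setup_kwargs)
```

If templates/ is later turned into a package of its own (as method 2 requires), it would also need to be listed under packages.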

1) Using pkg_resources from setuptools (slow)

You may use the pkg_resources package from the setuptools distribution, but that comes at a cost, performance-wise:

import pkg_resources

# Could be any dot-separated package/module name or a "Requirement"
resource_package = __name__
resource_path = '/'.join(('templates', 'temp_file'))  # Do not use os.path.join()
# resource_string() returns the file's contents as bytes
template = pkg_resources.resource_string(resource_package, resource_path)
# or for a file-like stream:
template = pkg_resources.resource_stream(resource_package, resource_path)

Tips:

  • This will read data even if your distribution is zipped, so you may set zip_safe=True in your setup.py, and/or use the long-awaited zipapp packer from Python 3.5 to create self-contained distributions.

  • Remember to add setuptools to your run-time requirements (e.g. in install_requires).

... and notice that according to the Setuptools/pkg_resources docs, you should not use os.path.join:

Basic Resource Access

Note that resource names must be /-separated paths and cannot be absolute (i.e. no leading /) or contain relative names like "..". Do not use os.path routines to manipulate resource paths, as they are not filesystem paths.
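One way to follow that rule is to build resource names with a small helper instead of os.path. The function below is a hypothetical convenience, not part of pkg_resources; it only joins components with "/" and rejects the forms the docs rule out.

```python
def resource_name(*parts: str) -> str:
    """Join components into a '/'-separated, relative resource name.

    Hypothetical helper: rejects empty components, '.' and '..', and
    components that already contain a path separator.
    """
    for part in parts:
        if not part or part in (".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"invalid resource name component: {part!r}")
    return "/".join(parts)


resource_path = resource_name("templates", "temp_file")
print(resource_path)  # templates/temp_file
```

The result is the same on every platform, which is the point: os.path.join() would produce backslashes on Windows.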

2) Python >= 3.7, or using the backported importlib_resources library

Use the standard library's importlib.resources module, which is more efficient than the setuptools method above:

try:
    import importlib.resources as pkg_resources
except ImportError:
    # Fall back to the `importlib_resources` backport on Python < 3.7.
    import importlib_resources as pkg_resources

from . import templates  # relative-import the *package* containing the templates

template = pkg_resources.read_text(templates, 'temp_file')
# or for a file-like stream:
template = pkg_resources.open_text(templates, 'temp_file')

Attention:

Regarding the function read_text(package, resource):

  • The package can be either a string or a module.
  • The resource is NOT a path anymore, but just the filename of the resource to open, within an existing package; it may not contain path separators and it may not have sub-resources (i.e. it cannot be a directory).

For the example in the question, we must now:

  • make <your_package>/templates/ into a proper package, by creating an empty __init__.py file in it,
  • so that we can use a simple (possibly relative) import statement (no more parsing of package/module names),
  • and simply ask for resource_name = "temp_file" (no path).
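The three steps can be tried end to end with the sketch below, which creates a throwaway package in a temporary directory. The package name demo_pkg and the file contents are invented for the example; in a real project the files already exist and a normal import statement replaces import_module().

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Lay out demo_pkg/templates/ including the empty __init__.py files.
    templates_dir = Path(tmp) / "demo_pkg" / "templates"
    templates_dir.mkdir(parents=True)
    (Path(tmp) / "demo_pkg" / "__init__.py").write_text("")
    (templates_dir / "__init__.py").write_text("")  # makes templates/ a package
    (templates_dir / "temp_file").write_text("Hello, template!")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        # Equivalent to "from demo_pkg import templates"
        templates = importlib.import_module("demo_pkg.templates")

        # Whole file as a string; the resource is a bare filename, not a path.
        template = importlib.resources.read_text(templates, "temp_file")

        # Or a file-like text stream.
        with importlib.resources.open_text(templates, "temp_file") as stream:
            first_line = stream.readline()
    finally:
        sys.path.remove(tmp)

print(template)  # Hello, template!
```

Depending on the Python version, read_text() and open_text() may emit a DeprecationWarning; importlib.resources.files(), available since Python 3.9, is the newer interface to the same data.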

Tips:

  • To access a file inside the current module's package, set the package argument to __package__, e.g. pkg_resources.read_text(__package__, 'temp_file') (thanks to @ben-mares).
  • Things become interesting when an actual filename is asked for with path(), since context managers are now used for temporarily-created files (read this).
  • Add the backported library conditionally for older Pythons, with install_requires=["importlib_resources ; python_version<'3.7'"] (check this if you package your project with setuptools<36.2.1).
  • Remember to remove the setuptools library from your runtime requirements, if you migrated from the traditional method.
  • Remember to customize setup.py or MANIFEST to include any static files.
  • You may also set zip_safe=True in your setup.py.
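The path() tip is easiest to see with a zipped package, where no real file exists until one is extracted. The following is a minimal sketch under that assumption; the archive and package names (path_demo.zip, path_demo_pkg) are made up, and the package is passed as a string to show that form too.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a zip that contains a (hypothetical) package with one template file.
    archive = os.path.join(tmp, "path_demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("path_demo_pkg/__init__.py", "")
        zf.writestr("path_demo_pkg/templates/__init__.py", "")
        zf.writestr("path_demo_pkg/templates/temp_file", "Hello via path()")

    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        # The package is given as a string here; a module object works as well.
        with importlib.resources.path("path_demo_pkg.templates", "temp_file") as real_path:
            # Inside the block, real_path is an actual file on disk
            # (a temporary copy extracted from the zip).
            existed_inside = real_path.is_file()
            content = real_path.read_text()
        # The temporary copy is cleaned up when the block exits.
        exists_after = real_path.exists()
    finally:
        sys.path.remove(archive)

print(existed_inside, exists_after)  # True False
print(content)                       # Hello via path()
```

Because the file may be temporary, anything that needs the path (for example a library that only accepts filenames) should do its work inside the with block.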