Creating Custom Tag in PyYAML

doremi picture doremi · Mar 28, 2017 · Viewed 9k times · Source

I'm trying to use Python's PyYAML to create a custom tag that will allow me to retrieve environment variables with my YAML.

import os
import yaml

class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!Env'

    def __init__(self, env_var):
       self.env_var = env_var

    def __repr__(self):
       return os.environ.get(self.env_var)

settings_file = open('conf/defaults.yaml', 'r')
settings = yaml.load(settings_file)

And inside of defaults.yaml is simply:

example: !ENV foo

The error I keep getting:

yaml.constructor.ConstructorError: 
could not determine a constructor for the tag '!ENV' in 
"defaults.yaml", line 1, column 10

I plan to have more than one custom tag as well (assuming I can get this one working)

Answer

Fredrick Brennan picture Fredrick Brennan · Mar 28, 2017

Your PyYAML class had a few problems:

  1. yaml_tag is case sensitive, so !Env and !ENV are different tags.
  2. So, as per the documentation, yaml.YAMLObject uses meta-classes to define itself, and has default to_yaml and from_yaml functions for those cases. By default, however, those functions require that your argument to your custom tag (in this case !ENV) be a mapping. So, to work with the default functions, your defaults.yaml file must look like this (just for example) instead:

example: !ENV {env_var: "PWD", test: "test"}

Your code will then work unchanged, in my case print(settings) now results in {'example': /home/Fred} But you're using load instead of safe_load -- in their answer below, Anthon pointed out that this is dangerous because the parsed YAML can overwrite/read data anywhere on the disk.

You can still easily use your YAML file format, example: !ENV foo—you just have to define an appropriate to_yaml and from_yaml in class EnvTag, ones that can parse and emit scalar variables like the string "foo".

So:

import os
import yaml

class EnvTag(yaml.YAMLObject):
    yaml_tag = u'!ENV'

    def __init__(self, env_var):
        self.env_var = env_var

    def __repr__(self):
        v = os.environ.get(self.env_var) or ''
        return 'EnvTag({}, contains={})'.format(self.env_var, v)

    @classmethod
    def from_yaml(cls, loader, node):
        return EnvTag(node.value)

    @classmethod
    def to_yaml(cls, dumper, data):
        return dumper.represent_scalar(cls.yaml_tag, data.env_var)

# Required for safe_load
yaml.SafeLoader.add_constructor('!ENV', EnvTag.from_yaml)
# Required for safe_dump
yaml.SafeDumper.add_multi_representer(EnvTag, EnvTag.to_yaml)

settings_file = open('defaults.yaml', 'r')

settings = yaml.safe_load(settings_file)
print(settings)

s = yaml.safe_dump(settings)
print(s)

When this program is run, it outputs:

{'example': EnvTag(foo, contains=)}
{example: !ENV 'foo'}

This code has the benefit of (1) using the original pyyaml, so nothing extra to install and (2) adding a representer. :)