How to deserialize an object with PyYAML using safe_load?

systempuntoout picture systempuntoout · Apr 13, 2010 · Viewed 12.1k times · Source

Having a snippet like this:

import yaml
class User(object):
    def __init__(self, name, surname):
       self.name= name
       self.surname= surname

user = User('spam', 'eggs')
serialized_user = yaml.dump(user)
#Network
deserialized_user = yaml.load(serialized_user)
print "name: %s, sname: %s" % (deserialized_user.name, deserialized_user.surname)

Yaml docs says that it is not safe to call yaml.load with any data received from an untrusted source; so, what should I modify to my snippet\class to use safe_load method?
Is it possible?

Answer

Petr Viktorin picture Petr Viktorin · May 23, 2010

Another way exists. From the PyYaml docs:

A python object can be marked as safe and thus be recognized by yaml.safe_load. To do this, derive it from yaml.YAMLObject [...] and explicitly set its class property yaml_loader to yaml.SafeLoader.

You also have to set the yaml_tag property to make it work.

YAMLObject does some metaclass magic to make the object loadable. Note that if you do this, the objects will only be loadable by the safe loader, not with regular yaml.load().

Working example:

import yaml

class User(yaml.YAMLObject):
    yaml_loader = yaml.SafeLoader
    yaml_tag = u'!User'

    def __init__(self, name, surname):
       self.name= name
       self.surname= surname

user = User('spam', 'eggs')
serialized_user = yaml.dump(user)

#Network

deserialized_user = yaml.safe_load(serialized_user)
print "name: %s, sname: %s" % (deserialized_user.name, deserialized_user.surname)

The advantage of this one is that it's prety easy to do; the disadvantages are that it only works with safe_load and clutters your class with serialization-related attributes and metaclass.