How can I control what scalar form PyYAML uses for my data?

Ned Batchelder · Dec 27, 2011

I've got an object with a short string attribute, and a long multi-line string attribute. I want to write the short string as a YAML quoted scalar, and the multi-line string as a literal scalar:

my_obj.short = "Hello"
my_obj.long = "Line1\nLine2\nLine3"

I'd like the YAML to look like this:

short: "Hello"
long: |
  Line1
  Line2
  Line3

How can I instruct PyYAML to do this? If I call yaml.dump(my_obj), it produces a dict-like output:

{long: 'line1

    line2

    line3

    ', short: Hello}

(Not sure why long is double-spaced like that...)

Can I dictate to PyYAML how to treat my attributes? I'd like to affect both the order and style.

Answer

xenosoz · Oct 23, 2015

Having fallen in love with @lbt's approach, I ended up with this code:

import yaml

def str_presenter(dumper, data):
    # Emit multiline strings in literal block style ('|'); leave other strings alone.
    if len(data.splitlines()) > 1:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)

# Register the representer for str on the default yaml.Dumper.
yaml.add_representer(str, str_presenter)

It makes every multiline string dump as a block literal.
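As a quick check of the representer above, here is a minimal sketch that registers the same function on a `SafeDumper` subclass and dumps a plain dict. The `LiteralDumper` name is made up for illustration, and `sort_keys=False` assumes PyYAML 5.1 or newer. The sample string ends with a newline so that the block header is a bare `|`; without it, PyYAML should add a chomping indicator and write `|-`.

```python
import yaml


def str_presenter(dumper, data):
    # Same logic as the representer above: literal block style for multiline strings.
    if len(data.splitlines()) > 1:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


class LiteralDumper(yaml.SafeDumper):
    """Hypothetical Dumper subclass, so the stock dumpers stay unchanged."""


# add_representer is a classmethod; calling it on the subclass affects only that subclass.
LiteralDumper.add_representer(str, str_presenter)

doc = {"short": "Hello", "long": "Line1\nLine2\nLine3\n"}

# default_flow_style=False forces block mappings (older PyYAML versions would
# otherwise pick the {…} flow form, where block scalars are not allowed).
# sort_keys=False keeps insertion order (assumes PyYAML >= 5.1).
text = yaml.dump(doc, Dumper=LiteralDumper, default_flow_style=False, sort_keys=False)
print(text)
```

Registering on a subclass keeps the change local. `yaml.add_representer` without a `Dumper` argument targets the default `yaml.Dumper`, so `yaml.safe_dump` would not pick the representer up.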

I was trying to avoid the monkey patching part. Full credit to @lbt and @J.F.Sebastian.
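The question also asks for the short string to be double-quoted, which the representer above does not cover. One possible way to choose the style per value is to wrap strings in small `str` subclasses and register a representer for each. This is only a sketch: `Quoted`, `Literal`, and `StyledDumper` are hypothetical names, not part of PyYAML, and `sort_keys=False` again assumes PyYAML 5.1 or newer.

```python
import yaml


class Quoted(str):
    """Hypothetical marker type: dump this string double-quoted."""


class Literal(str):
    """Hypothetical marker type: dump this string as a literal block."""


def quoted_presenter(dumper, data):
    # style='"' requests a double-quoted scalar.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')


def literal_presenter(dumper, data):
    # style='|' requests a literal block scalar.
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')


class StyledDumper(yaml.SafeDumper):
    """Hypothetical Dumper subclass holding the two extra representers."""


StyledDumper.add_representer(Quoted, quoted_presenter)
StyledDumper.add_representer(Literal, literal_presenter)

doc = {
    "short": Quoted("Hello"),
    "long": Literal("Line1\nLine2\nLine3\n"),
}

# Plain str keys still go through the stock representer; only wrapped values change style.
styled = yaml.dump(doc, Dumper=StyledDumper, default_flow_style=False, sort_keys=False)
print(styled)
```

Because the wrappers are ordinary `str` subclasses, the rest of the program can keep treating the values as strings, and loading the output back yields plain strings with the same content.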