PyYAML and using quotes for strings only

Jeroen Jacobs · Jul 14, 2016

I have the following YAML file:

---
my_vars:
  my_env: "dev"
  my_count: 3

When I read it with PyYAML and dump it again, I get the following output:

---
my_vars:
  my_env: dev
  my_count: 3

The code in question:

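import yaml

with open(env_file) as f:  # env_file holds the path to the YAML file above
    env_dict = yaml.safe_load(f)  # a bare yaml.load(f) needs a Loader argument on PyYAML 6+
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True))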

I tried using the default_style parameter:

with open(env_file) as f:
    env_dict = yaml.safe_load(f)  # a bare yaml.load(f) needs a Loader argument on PyYAML 6+
    print(yaml.dump(env_dict, indent=4, default_flow_style=False, explicit_start=True, default_style='"'))

But now I get:

---
"my_vars":
  "my_env": "dev"
  "my_count": !!int "3"

What do I need to do to keep the original formatting, without making any assumptions about the variable names in the YAML file?

Answer

Anthon · Jul 26, 2016

I suggest you switch to YAML 1.2 (released in 2009) with the backwards-compatible ruamel.yaml package, instead of using PyYAML, which implements most of YAML 1.1 (2005). (Disclaimer: I am the author of that package.)

Then you just enable preserve_quotes when loading to round-trip the YAML file:

import sys
import ruamel.yaml

yaml_str = """\
---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3
"""

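# The older round_trip_load()/round_trip_dump() functions are deprecated in
# newer ruamel.yaml releases; a YAML() instance does the same round-tripping.
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start = True
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)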

which outputs (including the preserved comment):

---
my_vars:
  my_env: "dev"    # keep "dev" quoted
  my_count: 3

After loading, the string scalars are instances of a subclass of str, which lets them carry the quoting information, but they behave like normal strings for all other purposes. If you want to replace such a string, though (e.g. dev with fgw), you have to cast the new string to this subclass (DoubleQuotedScalarString from ruamel.yaml.scalarstring).

When round-tripping, ruamel.yaml by default preserves the order of the keys (by insertion).