Single versus double quotes in json loads in Python

user248237 · Jan 15, 2013 · Viewed 23.1k times

I notice that single quotes cause simplejson's loads function to fail:

>>> import simplejson as json
>>> json.loads("\"foo\"")
'foo'
>>> json.loads("\'foo\'")
Traceback (most recent call last):
...
ValueError: No JSON object could be decoded

I'm parsing lines such as foo = ["a", "b", "c"] from a text file into Python lists, and would like to also accept foo = ['a', 'b', 'c']. simplejson is convenient for turning foo into a list automatically.

How can I get loads to accept single quotes, or automatically substitute double for single quotes without wrecking the input? thanks.

Answer

Martijn Pieters · Jan 15, 2013

Use the proper tool for the job. You are not parsing JSON but Python, so use ast.literal_eval() instead:

>>> import ast
>>> ast.literal_eval('["a", "b", "c"]')
['a', 'b', 'c']
>>> ast.literal_eval("['a', 'b', 'c']")
['a', 'b', 'c']
>>> ast.literal_eval('["mixed", \'quoting\', """styles"""]')
['mixed', 'quoting', 'styles']
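
For the "foo = [...]" lines described in the question, a minimal sketch could look like the following. The helper name parse_assignment is hypothetical, and the sketch assumes each line has the form name = <Python literal>, with the first "=" separating the two parts.

```python
import ast


def parse_assignment(line):
    """Split a 'name = <Python literal>' line into (name, value).

    Hypothetical helper for illustration; the line format is assumed
    from the examples in the question.
    """
    # Split on the first "=" only; everything after it is the literal.
    name, _, literal = line.partition("=")
    # literal_eval accepts only literals (strings, numbers, lists, ...),
    # so other expressions are rejected with an exception, not executed.
    return name.strip(), ast.literal_eval(literal.strip())


print(parse_assignment('foo = ["a", "b", "c"]'))
print(parse_assignment("foo = ['a', 'b', 'c']"))
```

Both calls give the same name and list, regardless of the quoting style used in the input.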

JSON and Python literal syntax differ in several ways:

  • JSON documents always use double quotes for strings, use UTF-16 code units for the \uhhhh hex escape syntax (so non-BMP codepoints are written as surrogate pairs), have {...} objects for key-value pairs whose keys are always strings, write sequences only as [...] lists, and use the null, true and false values; note the lowercase booleans. Numbers come in integer and floating point forms.

  • In Python, string representations can use single or double quotes, Unicode escapes use the \uhhhh and \Uhhhhhhhh forms (no UTF-16 surrogate pairs), dictionaries with {...} display syntax can have keys of many different types rather than just strings, and sequences can be lists ([...]) or tuples ((...)), with other container types possible as well. Python has None, True and False (Titlecase!), and numbers come in integer, float, and complex forms.
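
The differences listed above can be checked with a short sketch using the standard-library json module (the question uses simplejson, which is assumed to behave the same way here). The sample inputs are made up for illustration.

```python
import ast
import json

# JSON: lowercase constants, string-only keys, [...] for sequences.
from_json = json.loads('{"ok": true, "missing": null, "items": [1, 2.5]}')
print(from_json)

# Python literals: Titlecase constants, non-string keys, tuples, complex numbers.
from_python = ast.literal_eval(
    "{1: True, 'missing': None, 'pair': (1, 2), 'z': 1+2j}"
)
print(from_python)

# Each parser rejects the other format's syntax.
try:
    json.loads("['a', 'b']")
except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
    print("json.loads failed:", exc)

try:
    ast.literal_eval("[true, null]")
except ValueError as exc:
    print("literal_eval failed:", exc)
```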

Confusing one with the other can lead either to parse errors or to subtle problems where decoding happens to succeed but the data is wrongly interpreted, such as with escaped non-BMP codepoints like Emoji. Make sure to use the right method to decode them! In most cases where you do have Python-syntax data, someone actually used the wrong method of encoding and only accidentally produced Python representations. See if the source needs fixing in that case; usually the output was produced by using str(obj) where json.dumps(obj) should have been used instead.
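
As a rough illustration of both points, the sketch below compares str(obj) with json.dumps(obj) and then decodes the same escaped emoji with each parser. The sample data and variable names are assumptions made for this example.

```python
import ast
import json

data = {"name": "foo", "tags": ["a", "b"], "active": True, "extra": None}

as_repr = str(data)         # Python representation: single quotes, True, None
as_json = json.dumps(data)  # JSON: double quotes, true, null
print(as_repr)
print(as_json)

# Each output round-trips only through its matching decoder.
print(ast.literal_eval(as_repr) == data)
print(json.loads(as_json) == data)

# A JSON-style surrogate pair escape for U+1F600. Both decoders accept it,
# but they disagree on what it means.
escaped = r'"\ud83d\ude00"'
print(len(json.loads(escaped)))        # one code point: the emoji
print(len(ast.literal_eval(escaped)))  # two separate lone surrogates
```

The last two lines show the subtle case: decoding succeeds either way, but only json.loads combines the surrogate pair into a single character.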