"Unicode Error "unicodeescape" codec can't decode bytes... Cannot open text files in Python 3

Eric picture Eric · Aug 28, 2009 · Viewed 697k times · Source

I am using python 3.1, on a windows 7 machines. Russian is the default system language, and utf-8 is the default encoding.

Looking at the answer to a previous question, I have attempting using the "codecs" module to give me a little luck. Here's a few examples:

>>> g = codecs.open("C:\Users\Eric\Desktop\beeline.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#39>, line 1)
>>> g = codecs.open("C:\Users\Eric\Desktop\Site.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#40>, line 1)
>>> g = codecs.open("C:\Python31\Notes.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 11-12: malformed \N character escape (<pyshell#41>, line 1)
>>> g = codecs.open("C:\Users\Eric\Desktop\Site.txt", "r", encoding="utf-8")
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape (<pyshell#44>, line 1)

My last idea was, I thought it might have been the fact that windows "translates" a few folders, such as the "users" folder, into Russian (though typing "users" is still the correct path), so I tried it in the Python31 folder. Still, no luck. Any ideas?

Answer

Martin v. L&#246;wis picture Martin v. Löwis · Aug 28, 2009

The problem is with the string

"C:\Users\Eric\Desktop\beeline.txt"

Here, \U in "C:\Users... starts an eight-character Unicode escape, such as \U00014321. In your code, the escape is followed by the character 's', which is invalid.

You either need to duplicate all backslashes:

"C:\\Users\\Eric\\Desktop\\beeline.txt"

Or prefix the string with r (to produce a raw string):

r"C:\Users\Eric\Desktop\beeline.txt"