UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 304: ordinal not in range(128)

user1590499 · Aug 11, 2012

I just left my PC at work (running Python 2.7), where I had a script I was finishing up (reproduced below). It ran fine at work, and I only wanted to add one or two things. At home I'm using my Mac's version of Python (3.2.2), and I get the following error:

Traceback (most recent call last):
  File "/Users/Downloads/sda/alias.py", line 25, in <module>
    for row_2 in in_csv:
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 304: ordinal not in range(128)

My code is here:

import csv
inname = "Fund_Aliases.csv"
outname = "output.csv"

def first_word(value):
    return value.split(" ", 1)[0]

with open(inname, "r") as infile:
    with open(outname, "w") as outfile:
        in_csv = csv.reader(infile)
        out_csv = csv.writer(outfile)

        # Copy the header row and locate the columns of interest.
        column_names = next(in_csv)
        out_csv.writerow(column_names)

        id_index = column_names.index("id")
        name_index = column_names.index("name")

        try:
            row_1 = next(in_csv)
            written_row = False

            # Compare each row with the one before it.
            for row_2 in in_csv:
                if first_word(row_1[name_index]) == first_word(row_2[name_index]) and row_1[id_index] != row_2[id_index]:
                    if not written_row:
                        out_csv.writerow(row_1)

                    out_csv.writerow(row_2)
                    written_row = True
                else:
                    written_row = False

                row_1 = row_2
        except StopIteration:
            # No data rows!
            pass

Answer

unutbu · Aug 11, 2012

It looks like Fund_Aliases.csv is not an ASCII file: the byte 0xc3 reported in the error is outside the ASCII range (0 to 127).
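To illustrate why the ascii codec rejects that byte, here is a minimal sketch. The sample string and the try_decode helper are made up for the demonstration and are not taken from the file in the question; the sketch also assumes the text is UTF-8, which the error message alone does not prove.

```python
# Hypothetical sample text: "caf\u00e9" ends in an e with an acute accent,
# which UTF-8 encodes as the two bytes 0xc3 0xa9.
raw = "caf\u00e9".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or a 'failed: ...' message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: {}".format(exc)


ascii_result = try_decode(raw, "ascii")  # fails on the 0xc3 byte
utf8_result = try_decode(raw, "utf-8")   # succeeds

print(ascii_result)
print(utf8_result)
```

The ascii attempt fails with the same kind of message as in the traceback, while the utf-8 attempt returns the original text.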

According to the Python 3 docs:

Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding (see locale.getpreferredencoding()). To decode a file using a different encoding, use the encoding argument of open:

with open('some.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)

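So try passing the encoding argument to open(), using whatever encoding Fund_Aliases.csv was actually saved in. utf-8 is a reasonable first guess, since 0xc3 is a common lead byte in UTF-8 text.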
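As a rough sketch of that suggestion, the snippet below reads a CSV file with an explicit encoding. The read_rows helper and the sample data are hypothetical, and utf-8 is an assumption; the real file may use a different encoding.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Read all rows of a CSV file, decoding it with an explicit encoding."""
    # newline="" matches the open() call shown in the csv documentation.
    with open(path, "r", newline="", encoding=encoding) as infile:
        return list(csv.reader(infile))


# Build a small stand-in file (made-up data) containing non-ASCII characters.
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "Fund_Aliases.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["id", "name"], ["1", "Soci\u00e9t\u00e9 Fund"]])
    rows = read_rows(sample_path)

print(rows)
```

For the script in the question, this would mean changing open(inname, "r") to open(inname, "r", newline="", encoding="utf-8"), and probably giving the output file the same encoding so the non-ASCII names can be written back. If the file was saved in another encoding, such as latin-1 or cp1252, that name would go in place of utf-8.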