Python Throwing "'utf8' codec can't decode byte 0xd0 in position 0" Error

Raj · Oct 1, 2013 · Viewed 16.6k times

I am trying to load an existing worksheet and import a text file of comma-separated values into it. Screenshots of both are shown below.

Excel Sheet:

[Screenshot of the Excel sheet; image not available]

Text File:

[Screenshot of the text file; image not available]

I am using the code shown below:

    # import the modules needed for this operation
    import glob
    import csv
    from openpyxl import load_workbook
    import xlwt

    # read the text file(s) with the csv module, using a comma as the delimiter
    for filename in glob.glob("E:\Scripting_Test\Phase1\*.txt"):
        spamReader = csv.reader((open(filename, 'rb')), delimiter=',')


        # load the Excel workbook and select the first sheet
        wb = load_workbook(filename=r"E:\Scripting_Test\SeqTem\Seq0001.xls")
        ws = wb.worksheets(0)


        # write the data from the text file to the Excel file
        for rowx, row in enumerate(spamReader):
            for colx, value in enumerate(row):
                ws.write(rowx, colx, value)

        wb.save()

I am getting the following error message:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: invalid continuation byte

One more question: how can I tell Python to write the imported text data starting at cell A3 of the Excel sheet?

Answer

Adam Morris · Oct 1, 2013

Unicode encoding confuses me, but can't you force the value to ignore invalid bytes by saying:

value = unicode(value, errors='ignore')
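The `unicode()` built-in above only exists in Python 2. As a rough sketch of the same idea in Python 3, where the value would be a `bytes` object, the `errors` argument of `bytes.decode` controls what happens to bytes that are not valid UTF-8. The sample bytes below are made up for illustration and are not taken from the asker's files. Note that `errors="ignore"` silently drops the offending bytes, so it hides the problem rather than fixing a wrong encoding.

```python
# Hypothetical sample: 0xd0 opens a two-byte UTF-8 sequence, but the next
# byte (ASCII "a") is not a valid continuation byte.
raw = b"\xd0abc"

# Strict decoding (the default) raises UnicodeDecodeError at position 0.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# errors="ignore" drops the invalid byte and keeps the rest.
ignored = raw.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD so the data loss stays visible.
replaced = raw.decode("utf-8", errors="replace")

print(repr(ignored), repr(replaced))
```

If the input is actually in some other encoding, decoding with that encoding is likely a better fix than ignoring errors.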

Here is a great answer for further reading on Unicode: unicode().decode('utf-8', 'ignore') raising UnicodeEncodeError
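On the second question (starting at A3), one possible approach is to offset the row index while looping over the CSV rows. The sketch below is an assumption about what the asker wants, not a tested fix for their script: `cells_from_csv` and `write_to_sheet` are hypothetical helper names, and `ws` is assumed to be an openpyxl worksheet, whose `ws.cell(row=..., column=..., value=...)` method uses 1-based indices (so A3 is row 3, column 1). Keep in mind that openpyxl works with `.xlsx` files rather than the older `.xls` format.

```python
import csv
import io


def cells_from_csv(lines, start_row=3, start_col=1):
    """Yield (row, column, value) triples with 1-based indices.

    The first CSV value lands at (start_row, start_col), i.e. cell A3
    with the defaults used here.
    """
    for r, row in enumerate(csv.reader(lines), start=start_row):
        for c, value in enumerate(row, start=start_col):
            yield r, c, value


def write_to_sheet(ws, lines, start_row=3, start_col=1):
    # Assumes ws behaves like an openpyxl worksheet.
    for r, c, value in cells_from_csv(lines, start_row, start_col):
        ws.cell(row=r, column=c, value=value)


# Made-up sample data standing in for one of the text files.
sample = io.StringIO("a,b\n1,2\n")
cells = list(cells_from_csv(sample))
print(cells)
```

With a real workbook, the worksheet would come from something like `load_workbook(path).worksheets[0]`, and the workbook would be saved with an explicit filename afterwards.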