UnicodeDecodeError: 'utf-16-le'

Eddie Snipes picture Eddie Snipes · Dec 5, 2017 · Viewed 18.5k times · Source

I'm trying to pull a list of tables from an access database. The script goes through and displays about a third of the tables and gets the following error:

Traceback (most recent call last): File "C:/GageInfo/sourcecode for GageTrakNotify/__test script.py", line 31, in for fld in cursor2.columns(rows.table_name): UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 132-133: illegal encoding

Any tips would be appreciated.

import pyodbc


cursor1 = conn.cursor()
cursor2 = conn.cursor()

tblCount = 0
for rows in cursor1.tables():
   if rows.table_type == "TABLE":
       tblCount += 1
       print(rows.table_name)
       for fld in cursor2.columns(rows.table_name):
           print(fld.table_name, fld.column_name)

conn.close()

Answer

Gord Thompson picture Gord Thompson · Aug 2, 2021

This is almost certainly caused by a known issue with cursor.columns() when using the Access ODBC driver if the table metadata includes "Descriptions" for the columns. The following GitHub issue includes a discussion of the problem and possible workarounds:

https://github.com/mkleehammer/pyodbc/issues/328

TL;DR: The Access ODBC driver returns trailing garbage bytes under those circumstances and some combinations of the bytes will not be valid UTF-16LE encoded characters. A pyodbc output converter function can be used to intercept the byte string and decode it without throwing an exception. The specific workaround is described here:

https://github.com/mkleehammer/pyodbc/issues/328#issuecomment-419655266