Python MS Access Database Table Creation From Pandas Dataframe Using SQLAlchemy

Radical Edward · Dec 18, 2014 · Viewed 11.6k times

I'm trying to create an MS Access database from Python and was wondering if it's possible to create a table directly from a pandas dataframe. I know that I can use the pandas dataframe.to_sql() function to write the dataframe to an SQLite database, or use a sqlalchemy engine for some other database format (but not Access, unfortunately), but I can't get all the pieces to come together. Here's the code snippet that I've been testing with:

import pandas as pd
import sqlalchemy
import pypyodbc     # Used to actually create the .mdb file
import pyodbc

# Database path and ODBC driver, shared by the connection function and
# the creation step below
MDB = 'C:\\database.mdb'
DRV = '{Microsoft Access Driver (*.mdb)}'

# Connection function to use for sqlalchemy
def Connection():
    return pyodbc.connect('DRIVER={};DBQ={}'.format(DRV, MDB))


# Try to connect to the database
try:
    Conn = Connection()
# If it fails because it hasn't been created yet, create it and connect to it
except pyodbc.Error:
    pypyodbc.win_create_mdb(MDB)
    Conn = Connection()

# Create the sqlalchemy engine using the pyodbc connection
Engine = sqlalchemy.create_engine('mysql+pyodbc://', creator=Connection)

# Some dataframe
data = {'Values'     : [1., 2., 3., 4.],
        'FruitsAndPets'  : ["Apples", "Oranges", "Puppies", "Ducks"]}
df = pd.DataFrame(data)

# Try to send it to the access database (and fail)
df.to_sql('FruitsAndPets', Engine, index = False)

I'm not sure that what I'm trying to do is even possible with the current packages I'm using but I wanted to check here before I write my own hacky dataframe to MS Access table function. Maybe my sqlalchemy engine is set up wrong?

Here's the end of my error with mssql+pyodbc in the engine:

cursor.execute(statement, parameters)
sqlalchemy.exc.DBAPIError: (Error) ('HY000', "[HY000] [Microsoft][ODBC Microsoft Access Driver] Could not find file 'C:\\INFORMATION_SCHEMA.mdb'. (-1811) (SQLExecDirectW)") u'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] \nFROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] \nWHERE [COLUMNS_1].[TABLE_NAME] = ? AND [COLUMNS_1].[TABLE_SCHEMA] = ?' (u'FruitsAndPets', u'dbo')

and the ending error for mysql+pyodbc in the engine:

cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE', 'SELECT', or 'UPDATE'. (-3500) (SQLExecDirectW)") "SHOW VARIABLES LIKE 'character_set%%'" ()

Just to note, I don't care whether I use sqlalchemy or pandas to_sql(); I'm just looking for an easy way of getting a dataframe into my MS Access database. If that means dumping to JSON and then looping over the rows to insert them manually with SQL, whatever, if it works well I'll take it.

Answer

FrancisWolcott · Jan 13, 2017

For those still looking into this, basically you can't use pandas to_sql method for MS Access without a great deal of difficulty. If you are determined to do it this way, here is a link where someone fixed sqlalchemy's Access dialect (and presumably the OP's code would work with this Engine):

connecting sqlalchemy to MSAccess

The best way to get a data frame into MS Access is to build the INSERT statements from the records, then connect via pyodbc or pypyodbc and execute them with a cursor. The inserts have to be done one at a time, so if you have a lot of data it's probably best to break them up into chunks (around 5000 rows).
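For what it's worth, here is a rough sketch of that approach. It's not the only way to do it and it comes with assumptions: insert_rows and chunked are hypothetical helper names I made up, the target table is assumed to already exist (create it first with a CREATE TABLE statement run through the cursor), and the connection can be any DB-API connection that uses ? placeholders, which pyodbc does. It uses a parameterized INSERT instead of pasting values into the SQL string, and commits after each chunk.

```python
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_rows(conn, table, columns, rows, chunk_size=5000):
    """Insert rows through a DB-API connection, one chunk at a time.

    Hypothetical helper: `table` must already exist, and `rows` is an
    iterable of tuples whose values line up with `columns`.
    Returns the number of rows sent to the database.
    """
    # Square brackets quote names that contain spaces or reserved words.
    col_list = ", ".join("[{}]".format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = "INSERT INTO [{}] ({}) VALUES ({})".format(
        table, col_list, placeholders)

    cursor = conn.cursor()
    total = 0
    for chunk in chunked(rows, chunk_size):
        cursor.executemany(sql, chunk)  # one parameterized INSERT per row
        conn.commit()                   # commit each chunk separately
        total += len(chunk)
    return total


# Possible usage with the question's dataframe and connection function
# (assumes the FruitsAndPets table was created beforehand):
#
#   conn = Connection()
#   rows = df.itertuples(index=False, name=None)
#   insert_rows(conn, 'FruitsAndPets', list(df.columns), rows)
```

Committing per chunk keeps each transaction small, so a failure partway through only loses the chunk in progress. If the driver complains about value types, converting the dataframe values to plain Python types before inserting is worth trying.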