I want to query a PostgreSQL database and return the output as a Pandas dataframe.
I created a connection to the database with 'SqlAlchemy':
from sqlalchemy import create_engine
engine = create_engine('postgresql://user@localhost:5432/mydb')
I write a Pandas dataframe to a database table:
i=pd.read_csv(path)
i.to_sql('Stat_Table',engine,if_exists='replace')
Based on the docs, looks like pd.read_sql_query() should accept a SQLAlchemy engine:
a=pd.read_sql_query('select * from Stat_Table',con=engine)
But it throws an error:
ProgrammingError: (ProgrammingError) relation "stat_table" does not exist
I'm using Pandas version 0.14.1.
What's the right way to do this?
You are bitten by the case (in)sensitivity issues with PostgreSQL. If you quote the table name in the query, it will work:
df = pd.read_sql_query('select * from "Stat_Table"',con=engine)
But personally, I would advise to just always use lower case table names (and column names), also when writing the table to the database to prevent such issues.
From the PostgreSQL docs (http://www.postgresql.org/docs/8.0/static/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS):
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case
To explain a bit more: you have written a table with the name Stat_Table
to the database (and sqlalchemy will quote this name, so it will be written as "Stat_Table" in the postgres database). When doing the query 'select * from Stat_Table'
the unquoted table name will be converted to lower case stat_table
, and so you get the message that this table is not found.
See eg also Are PostgreSQL column names case-sensitive?