Are there any examples of how to pass parameters with an SQL query in Pandas?
In particular I'm using an SQLAlchemy engine to connect to a PostgreSQL database. So far I've found that the following works:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN %s AND %s'),
db,params=[datetime(2014,6,24,16,0),datetime(2014,6,24,17,0)],
index_col=['Timestamp'])
The Pandas documentation says that params can also be passed as a dict, but I can't seem to get this to work having tried for instance:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN :dstart AND :dfinish'),
db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
index_col=['Timestamp'])
What is the recommended way of running these types of queries from Pandas?
The read_sql
docs say this params
argument can be a list, tuple or dict (see docs).
To pass the values in the sql query, there are different syntaxes possible: ?
, :1
, :name
, %s
, %(name)s
(see PEP249).
But not all of these possibilities are supported by all database drivers, which syntax is supported depends on the driver you are using (psycopg2
in your case I suppose).
In your second case, when using a dict, you are using 'named arguments', and according to the psycopg2
documentation, they support the %(name)s
style (and so not the :name
I suppose), see http://initd.org/psycopg/docs/usage.html#query-parameters.
So using that style should work:
df = psql.read_sql(('select "Timestamp","Value" from "MyTable" '
'where "Timestamp" BETWEEN %(dstart)s AND %(dfinish)s'),
db,params={"dstart":datetime(2014,6,24,16,0),"dfinish":datetime(2014,6,24,17,0)},
index_col=['Timestamp'])