How to write data to Redshift that is a result of a dataframe created in Python?

Sahil picture Sahil · Jul 15, 2016 · Viewed 36.6k times · Source

I have a dataframe in Python. Can I write this data to Redshift as a new table? I have successfully created a db connection to Redshift and am able to execute simple sql queries. Now I need to write a dataframe to it.

Answer

Andrew picture Andrew · Sep 26, 2016

You can use to_sql to push data to a Redshift database. I've been able to do this using a connection to my database through a SQLAlchemy engine. Just be sure to set index = False in your to_sql call. The table will be created if it doesn't exist, and you can specify if you want you call to replace the table, append to the table, or fail if the table already exists.

from sqlalchemy import create_engine
import pandas as pd

conn = create_engine('postgresql://username:[email protected]:5439/yourdatabase')

df = pd.DataFrame([{'A': 'foo', 'B': 'green', 'C': 11},{'A':'bar', 'B':'blue', 'C': 20}])

df.to_sql('your_table', conn, index=False, if_exists='replace')

Note that you may need to pip install psycopg2 in order to connect to Redshift through SQLAlchemy.

to_sql Documentation