How to pipe data from AWS Postgres RDS to S3 (then Redshift)?

jenswirf · Nov 6, 2014

I'm using the AWS Data Pipeline service to pipe data from an RDS MySQL database to S3 and then on to Redshift, which works nicely.

However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a workaround?

"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB"

Answer

PeterssonJesper · Dec 6, 2015

Nowadays you can define a copy activity to extract data from a Postgres RDS instance into S3. In the Data Pipeline interface:

  1. Create a data node of the type SqlDataNode. Specify the table name and the select query.
  2. Set up the database connection by specifying the RDS instance ID (the first part of your instance endpoint, e.g. your-instance-id in your-instance-id.xxxxx.eu-west-1.rds.amazonaws.com) along with the username, password and database name.
  3. Create a data node of the type S3DataNode.
  4. Create a Copy activity and set the SqlDataNode as its input and the S3DataNode as its output.
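
If you prefer a pipeline definition file over clicking through the interface, the four steps above map roughly onto the objects below. This is only a sketch: the helper name build_pipeline_definition, the object ids and the example values are made up for illustration, and the field names (rdsInstanceId, *password, selectQuery, directoryPath) are written as I understand them from the Data Pipeline object reference, so check them against the current docs. A real pipeline also needs a schedule and a resource for the activity to run on, which are left out here.

```python
import json


def build_pipeline_definition(rds_instance_id, database_name, username,
                              password, table, s3_directory):
    """Build a pipeline definition dict mirroring the four steps above.

    Hypothetical helper: object ids and layout are illustrative only.
    """
    objects = [
        # Step 1: SqlDataNode with table name and select query.
        {
            "id": "SourceTable",
            "name": "SourceTable",
            "type": "SqlDataNode",
            "database": {"ref": "PostgresDatabase"},
            "table": table,
            # '#{table}' is expanded by Data Pipeline, not by Python.
            "selectQuery": "select * from #{table}",
        },
        # Step 2: database connection identified by the RDS instance ID.
        {
            "id": "PostgresDatabase",
            "name": "PostgresDatabase",
            "type": "RdsDatabase",
            "rdsInstanceId": rds_instance_id,
            "databaseName": database_name,
            "username": username,
            "*password": password,
        },
        # Step 3: S3DataNode that receives the exported rows.
        {
            "id": "S3Output",
            "name": "S3Output",
            "type": "S3DataNode",
            "directoryPath": s3_directory,
        },
        # Step 4: CopyActivity wiring input to output.
        # (Schedule and runsOn resource are omitted in this sketch.)
        {
            "id": "CopyToS3",
            "name": "CopyToS3",
            "type": "CopyActivity",
            "input": {"ref": "SourceTable"},
            "output": {"ref": "S3Output"},
        },
    ]
    return {"objects": objects}


if __name__ == "__main__":
    definition = build_pipeline_definition(
        rds_instance_id="your-instance-id",   # placeholder
        database_name="THE_DB",               # placeholder
        username="THE_USER",                  # placeholder
        password="THE_PASSWORD",              # placeholder
        table="my_table",                     # placeholder
        s3_directory="s3://my-bucket/export/",  # placeholder
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON can be used as a starting point for a definition file; from S3 the data can then be loaded into Redshift the same way as in the MySQL pipeline.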