I am trying to use dplyr to connect to a remote database, that I usually query through a ssh tunnel.
I first set up a ssh tunnel like the following:
alias tunnel_ncg='ssh -fNg -L 3307:127.0.0.1:3306 mysqluser@myhost mysql5 -h 127.0.0.1:3306 -P 3307 -u mysqluser -p mypassword'
At this point I can access the database by connecting to localhost:3307. For example:
mysql -h '127.0.0.1' -P 3307 -u mysqluser
If I try to access the same database through dplyr, I get an error complaining that it can't connect to the local MySQL socket:
> conDplyr = src_mysql(dbname = "mydb", user = "mysqluser", password = "mypassword", host = "localhost", port=3307)
Error in .local(drv, ...) :
Failed to connect to database: Error: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)
My understanding is that RMySQL/dplyr are trying to looking for a socket file in the local computer, however they should really be looking for it in the remote server. Is there a way to fix this, or a work-around?
UPDATE:
If I try to connect through dbConnect/RMySQL, the connection works fine:
> dbConnect(dbDriver("MySQL"), user="mysqluser", password="mypassword", dbname="mydb", host="127.0.0.1", port=3307)
<MySQLConnection:0,1>
As silly as it sounds replacing localhost
with an IP address (127.0.0.1
) solves the problem.
src_mysql(
dbname = "mydb", user = "mysqluser", password = "mypassword",
host = "127.0.0.1", port=3307)
For an explanation take a look at the MySQL documentation:
On Unix, MySQL programs treat the host name localhost specially, in a way that is likely different from what you expect compared to other network-based programs.
For connections to localhost, MySQL programs attempt to connect to the local server by using a Unix socket file. This occurs even if a --port or -P option is given to specify a port number.
To ensure that the client makes a TCP/IP connection to the local server, use --host or -h to specify a host name value of 127.0.0.1, or the IP address or name of the local server.