Is there a Spark SQL jdbc driver?

apache-spark jdbc apache-spark-sql azure-hdinsight

aaronsteers · Jun 9, 2016 · Viewed 10.7k times · Source

I'm looking for a client JDBC driver that supports Spark SQL.

I have been using Jupyter so far to run SQL statements on Spark (running on HDInsight) and I'd like to be able to connect using JDBC so I can use third-party SQL clients (e.g. SQuirreL, SQL Explorer, etc.) instead of the notebook interface.

I found an ODBC driver from Microsoft, but this doesn't help me with Java-based SQL clients. I also tried downloading the Hive JDBC driver from my cluster, but it does not appear to support the more advanced SQL features that Spark does. For example, the Hive driver complains that it does not support join statements that are not equi-joins, yet I know Spark supports them because I've executed the same SQL in Jupyter successfully.

Answer

"the Hive JDBC driver does not appear to support more advanced SQL features that Spark does"

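Regardless of what the Hive driver appears to support, the Spark Thrift Server is fully compatible with the Hive/Beeline JDBC connection. The driver only carries the statement to the server; when that server is the Spark Thrift Server, Spark SQL is what parses and executes it.

Therefore, the Hive JDBC driver JAR is the one you need to use. I have verified that this works in DbVisualizer.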
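
To illustrate, here is a minimal sketch of connecting to the Spark Thrift Server from plain Java through the Hive JDBC driver. It assumes the Hive JDBC JAR is on the classpath and that the server listens on port 10000 (the usual HiveServer2 default); a managed cluster such as HDInsight may expose it on a different port or transport, so check your cluster's settings. The class name, the tables t1 and t2, and their columns are hypothetical and only there to show a non-equi-join.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SparkThriftJdbcExample {

    // Builds a HiveServer2-style JDBC URL; the Spark Thrift Server accepts the same scheme.
    public static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            // Without connection details, only show what the URL would look like.
            System.out.println("usage: SparkThriftJdbcExample <host> <user> <password>");
            System.out.println("example URL: " + buildUrl("localhost", 10000, "default"));
            return;
        }

        // Assumption: port 10000 and the "default" database; adjust for your cluster.
        String url = buildUrl(args[0], 10000, "default");

        // Hypothetical tables t1 and t2, joined on an inequality (a non-equi-join).
        String sql = "SELECT a.id, b.id FROM t1 a JOIN t2 b ON a.value < b.value";

        // The Hive JDBC driver registers itself with DriverManager when its JAR is on the classpath.
        try (Connection conn = DriverManager.getConnection(url, args[1], args[2]);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getString(2));
            }
        }
    }
}
```

A third-party client such as SQuirreL or DbVisualizer needs the same two things: the Hive JDBC JAR registered as the driver and a jdbc:hive2:// URL pointing at the Thrift Server.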

The alternative would be to run Spark code directly in your own Java clients (not third-party tools) and skip the JDBC connection altogether.
