I am trying to use the Spark Cassandra Connector with Spark 1.1.0.
I have successfully built the jar file from the master branch on GitHub and have gotten the included demos to work. However, when I try to load the jar files into the spark-shell, I can't import any of the classes from the com.datastax.spark.connector package.
I have tried using the --jars option on spark-shell and adding the directory with the jar file to Java's CLASSPATH. Neither of these options works. In fact, when I use the --jars option, the logging output shows that the Datastax jar is getting loaded, but I still cannot import anything from com.datastax.
I have been able to load the Tuplejump Calliope Cassandra connector into the spark-shell using --jars, so I know that's working. It's just the Datastax connector which is failing for me.
I got it. Below is what I did:
$ git clone https://github.com/datastax/spark-cassandra-connector.git
$ cd spark-cassandra-connector
$ sbt/sbt assembly
$ $SPARK_HOME/bin/spark-shell --jars ~/spark-cassandra-connector/spark-cassandra-connector/target/scala-2.10/connector-assembly-1.2.0-SNAPSHOT.jar
At the Scala prompt, stop the shell's default SparkContext and create a new one configured with the Cassandra host:
scala> sc.stop
scala> import com.datastax.spark.connector._
scala> import org.apache.spark.SparkContext
scala> import org.apache.spark.SparkContext._
scala> import org.apache.spark.SparkConf
scala> val conf = new SparkConf(true).set("spark.cassandra.connection.host", "my cassandra host")
scala> val sc = new SparkContext("spark://spark host:7077", "test", conf)
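If more than one jar has to go on the shell's classpath (say, the connector assembly plus another library), --jars takes a comma-separated list. The helper below is only a sketch: jars_arg is a hypothetical name, not part of Spark or the connector. It checks that every path is an existing file before joining the paths with commas, which catches a mistyped jar path before spark-shell starts.

```shell
# jars_arg is a hypothetical helper, not something Spark provides.
# It prints its arguments joined by commas (the format --jars expects)
# and fails if any argument is not an existing regular file.
jars_arg() (
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      echo "missing jar: $jar" >&2
      exit 1   # the function body is a subshell, so this only ends the function
    fi
  done
  IFS=','
  printf '%s\n' "$*"   # "$*" joins the arguments with the first character of IFS
)

# Example use (paths are placeholders; substitute the jars you built):
#   "$SPARK_HOME/bin/spark-shell" --jars "$(jars_arg /path/to/connector-assembly.jar /path/to/other.jar)"
```

Because the function body runs in a subshell, the IFS change does not leak into the calling shell.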