reduceByKey method not being found in Scala Spark

blue-sky · May 30, 2014 · Viewed 12.7k times

I am attempting to run the standalone Scala app from the Spark quick start (http://spark.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala) from source.

This line:

val wordCounts = logData.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

fails to compile with this error:

value reduceByKey is not a member of org.apache.spark.rdd.RDD[(String, Int)]
  val wordCounts = logData.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

logData.flatMap(line => line.split(" ")).map(word => (word, 1)) returns a MappedRDD, but I cannot find this type in the API docs at http://spark.apache.org/docs/0.9.1/api/core/index.html#org.apache.spark.rdd.RDD

I'm running this code from the Spark source, so could this be a classpath problem? The required dependencies are on my classpath, though.

Answer

maasg · May 30, 2014

You should import the implicit conversions from SparkContext:

import org.apache.spark.SparkContext._

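Spark uses the 'pimp my library' pattern to add methods to RDDs of specific types. reduceByKey is not defined on RDD itself but on PairRDDFunctions, and the import brings into scope the implicit conversion that wraps an RDD of key/value pairs in PairRDDFunctions. If curious, see SparkContext:1296.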
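
To illustrate how this pattern makes a method appear or disappear depending on an import, here is a minimal sketch in plain Scala 2.x with no Spark dependency. The names MiniRDD, MiniPairFunctions, MiniContext and Demo are hypothetical stand-ins invented for this example; they are not Spark classes, and the real Spark code is more involved.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for an RDD: a thin wrapper around a Seq.
// It has map and flatMap, but deliberately no reduceByKey.
class MiniRDD[T](val items: Seq[T]) {
  def map[U](f: T => U): MiniRDD[U] = new MiniRDD(items.map(f))
  def flatMap[U](f: T => Iterable[U]): MiniRDD[U] = new MiniRDD(items.flatMap(f))
}

// Extra methods that only make sense when the elements are key/value pairs.
class MiniPairFunctions[K, V](self: MiniRDD[(K, V)]) {
  // Groups the pairs by key and combines the values of each key with f.
  def reduceByKey(f: (V, V) => V): Map[K, V] =
    self.items.groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2).reduce(f) }
}

// The implicit conversion lives in a separate object, so it must be imported
// before the compiler will apply it.
object MiniContext {
  implicit def toPairFunctions[K, V](rdd: MiniRDD[(K, V)]): MiniPairFunctions[K, V] =
    new MiniPairFunctions(rdd)
}

object Demo {
  // Without this import the call to reduceByKey below does not compile:
  // "value reduceByKey is not a member of MiniRDD[(String, Int)]".
  import MiniContext._

  val lines = new MiniRDD(Seq("a b a", "b a"))

  // "a" occurs three times and "b" twice in the input above.
  val wordCounts: Map[String, Int] =
    lines
      .flatMap(line => line.split(" ").toSeq)
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b)

  def main(args: Array[String]): Unit = println(wordCounts)
}
```

The compiler only considers the conversion when it is in scope, which is why the same line compiles or fails depending on a single import.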