How do you get WordCount.java to compile on Cloudera 4?

ChaseMedallion picture ChaseMedallion · Aug 11, 2012 · Viewed 8k times · Source

I'm trying to compile a simple WordCount.java map-reduce example on a linux (CentOS) installation of Cloudera 4. I keep hitting compiler errors when I reference any of the hadoop classes, but I can't figure out which jars of the hundreds under /usr/lib/hadoop I need to add to my classpath to get things to compile. Any help would be greatly appreciated! What I'd like most is a java file for word count (just in case the one I found is bad for some reason) along with the associated command to compile and run it.

I am trying to do this using just javac rather than Eclipse. My main issue either way is what exactly are the Hadoop libraries from the Cloudera 4 install which I need to include in order to get the classic WordCount example to compile. Basically, I need to put the Java MapReduce API classes (Mapper, Reducer, etc.) in my classpath.

Answer

Ben Reich picture Ben Reich · Aug 13, 2012

I have a script that builds my hadoop classes. Try:

#!/bin/bash

program=`echo $1 | awk -F "." '{print $1}'`

if [ ! -d "${program}_classes" ]
    then    mkdir ${program}_classes/;
fi

javac -classpath /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar:/usr/lib/hadoop/client/h\
adoop-mapreduce-client-core-2.0.0-cdh4.0.1.jar -d ${program}_classes/ $1

jar -cvf ${program}.jar -C ${program}_classes/ .;

You were probably missing the key jars:

 /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar

and

/usr/lib/hadoop/client/hadoop-mapreduce-client-core-2.0.0-cdh4.0.1.jar