I've written a simple LzoWordCount job and added the following to my Gateway/hadoop-env.sh:
HADOOP_CLASSPATH=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo-cdh4-0.4.15-gplextras.jar
JAVA_LIBRARY_PATH=/opt/cloudera/parcels/HADOOP_LZO-0.4.15-1.gplextras.p0.105/lib/hadoop/lib/native/
When I run the MR job, I get:
mapred.JobClient: Task Id : attempt_201307311800_0020_m_000002_2, Status : FAILED java.lang.RuntimeException: native-lzo library not available
Any ideas how to fix this issue? I did notice that 'hadoop classpath | grep native' returns nothing.
The problem turned out to be that we did not have lzop installed on the datanodes. I fixed it using:
sudo apt-get install lzop
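Since the failure happens inside the map tasks, the install has to be done on every datanode, not just the gateway. Here is a rough sketch of a check you could run on each node to confirm things are in place. The helper names report_tool and report_native_dir are made up for illustration, and I'm assuming the hadoop-lzo JNI library is the usual libgplcompression file living under the native directory from the question. Presumably the lzop package helps because it pulls in the liblzo2 shared library, but I haven't verified that beyond this setup.

```shell
#!/bin/sh
# Sketch only: sanity checks for one datanode. Helper names are hypothetical.

# Print whether a command is available on PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

# Print whether a directory contains the hadoop-lzo JNI library
# (assumed to be named libgplcompression*).
report_native_dir() {
    if ls "$1"/libgplcompression* >/dev/null 2>&1; then
        echo "$1: libgplcompression present"
    else
        echo "$1: libgplcompression MISSING"
    fi
}

# Falls back to the native path from the question if JAVA_LIBRARY_PATH is unset.
report_tool lzop
report_native_dir "${JAVA_LIBRARY_PATH:-/opt/cloudera/parcels/HADOOP_LZO-0.4.15-1.gplextras.p0.105/lib/hadoop/lib/native/}"
```

If either line reports MISSING on a node, tasks scheduled there are likely to hit the same "native-lzo library not available" error.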