native-lzo library not available on Hadoop datanodes

Carl Sagan · Aug 5, 2013

I've written a simple LzoWordCount job and added the following to my Gateway/hadoop-env.sh:

HADOOP_CLASSPATH=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo-cdh4-0.4.15-gplextras.jar
JAVA_LIBRARY_PATH=/opt/cloudera/parcels/HADOOP_LZO-0.4.15-1.gplextras.p0.105/lib/hadoop/lib/native/

When I run the MR job, I get:

mapred.JobClient: Task Id : attempt_201307311800_0020_m_000002_2, Status : FAILED java.lang.RuntimeException: native-lzo library not available

Any ideas how to fix this issue? I did notice that 'hadoop classpath | grep native' returns nothing.
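
A note on that last check: 'hadoop classpath' normally lists only jars and directories on the Java classpath, so an empty grep for "native" is not by itself a sign of trouble. Native libraries are resolved through java.library.path instead. Since the failing task attempts run on the datanodes rather than on the gateway, it may be more telling to look at the native directory on a datanode. The following is a minimal sketch, assuming the hadoop-lzo JNI library is named libgplcompression.so (the name hadoop-lzo normally uses) and that JAVA_LIBRARY_PATH holds a single directory; has_native_lzo is a hypothetical helper, not part of Hadoop:

```shell
#!/bin/sh
# Sketch: check whether a directory holds the hadoop-lzo native library.
# has_native_lzo is a hypothetical helper; the libgplcompression name is
# an assumption based on how hadoop-lzo usually names its JNI library.
has_native_lzo() {
    dir="$1"
    for f in "$dir"/libgplcompression.so*; do
        # The glob stays literal when nothing matches, so test for existence.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Check the directory named in JAVA_LIBRARY_PATH, if one is set.
if [ -n "${JAVA_LIBRARY_PATH:-}" ]; then
    if has_native_lzo "$JAVA_LIBRARY_PATH"; then
        echo "native-lzo found in $JAVA_LIBRARY_PATH"
    else
        echo "native-lzo missing from $JAVA_LIBRARY_PATH"
    fi
fi
```

Run on a datanode with JAVA_LIBRARY_PATH set to the native directory from hadoop-env.sh, this prints one line saying whether the library file is there. Even when the file exists, loading can still fail if a library it depends on is absent from the node.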

Answer

Carl Sagan · Aug 7, 2013

The problem turned out to be that we did not have lzop installed on the datanodes. I fixed it using:

sudo apt-get install lzop
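
To confirm the fix on a datanode, something like the following could be run after the install. It is a rough sketch, assuming a glibc-based Linux where 'ldconfig -p' prints the shared-library cache, and assuming the missing piece was the liblzo2 shared library (on Debian/Ubuntu the lzop package normally pulls it in as a dependency). has_liblzo2 is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: look for liblzo2 in the dynamic linker cache.
# has_liblzo2 is a hypothetical helper. It reads `ldconfig -p` style
# output on stdin and succeeds if a liblzo2 entry is listed.
has_liblzo2() {
    grep 'liblzo2\.so' >/dev/null
}

# ldconfig may live in /sbin and be missing from PATH for ordinary users,
# so a negative result here means "not confirmed" rather than "absent".
if command -v ldconfig >/dev/null 2>&1 && ldconfig -p | has_liblzo2; then
    echo "liblzo2 is visible to the dynamic linker"
else
    echo "could not confirm liblzo2; check that lzop is installed on this node"
fi
```

The check has to pass on every datanode that runs map or reduce tasks, since a single node without the library is enough to make task attempts fail there.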