I'm a dummy on Ubuntu 16.04, desperately attempting to make Spark work.
I've tried to fix my problem using the answers found here on Stack Overflow, but none of them resolved it.
Launching Spark with the command ./spark-shell
from the bin folder, I get this message:
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
My Java version is:
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
Spark is the latest version, 2.0.1 with Hadoop 2.7. I've also retried with an older Spark package, 1.6.2 with Hadoop 2.4, but I get the same result. I also tried to install Spark on Windows, but it seems harder than doing it on Ubuntu.
I also tried to run some commands on Spark from my laptop: I can define an object, create an RDD, store it in cache, and use functions like .map(), but when I try to run .reduceByKey() I get a long series of error messages.
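For reference, this is roughly the kind of thing I'm running in spark-shell (the data here is just an illustration; sc is the SparkContext the shell provides):

// inside spark-shell, which already defines sc
val words = sc.parallelize(Seq("a", "b", "a", "c", "b", "a"))
val pairs = words.map(w => (w, 1))     // .map() works fine
pairs.cache()                          // caching works too
val counts = pairs.reduceByKey(_ + _)  // this is where the errors start
counts.collect().foreach(println)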
Maybe it's because the Hadoop library is compiled for 32-bit, while I'm on a 64-bit system?
Thanks.
Steps to fix:
1. Install a Hadoop distribution that includes the 64-bit native libraries (or build them yourself).
2. Set HADOOP_HOME to point to that directory.
3. Add $HADOOP_HOME/lib/native to LD_LIBRARY_PATH.
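For example, assuming Hadoop is unpacked under /usr/local/hadoop (adjust the path to wherever your Hadoop installation actually lives), adding something like this to ~/.bashrc should do it:

export HADOOP_HOME=/usr/local/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH

Then open a new terminal (or run source ~/.bashrc) before launching ./spark-shell again, and the NativeCodeLoader warning should go away.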