I'm trying to run the Terasort benchmarks and i'm getting the following exception:
java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:573)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 10 more
Caused by: java.lang.IllegalArgumentException: can't read paritions file
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:213)
... 15 more
Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:720)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.readPartitions(TeraSort.java:153)
at org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner.configure(TeraSort.java:210)
... 15 more
The TeraGen commands run fine and have created the input files for TeraSort. Here is the listing of my input directory:
bin/hadoop fs -ls /user/hadoop/terasort-input/Warning: Maximum heap size rounded up to 1024 MB
Found 5 items
-rw-r--r-- 1 sqatest supergroup 0 2012-01-23 14:13 /user/hadoop/terasort-input/_SUCCESS
drwxr-xr-x - sqatest supergroup 0 2012-01-23 13:30 /user/hadoop/terasort-input/_logs
-rw-r--r-- 1 sqatest supergroup 129 2012-01-23 15:49 /user/hadoop/terasort-input/_partition.lst
-rw-r--r-- 1 sqatest supergroup 50000000000 2012-01-23 13:30 /user/hadoop/terasort-input/part-00000
-rw-r--r-- 1 sqatest supergroup 50000000000 2012-01-23 13:30 /user/hadoop/terasort-input/part-00001
Here is my command for running the terasort:
bin/hadoop jar hadoop-examples-0.20.203.0.jar terasort -libjars hadoop-examples-0.20.203.0.jar /user/hadoop/terasort-input /user/hadoop/terasort-output
I do see the file _partition.lst in my input directory, i dont understand why i am getting the FileNotFoundException.
I followed the setup details provided at: http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
I got this to work as follows:
I'm running in local mode from my hadoop base directory, hadoop-1.0.0 with an input subdirectory under it, and I get the same error you do.
I edited the failing java file to get it to log the path instead of the filename, rebuilt it ("ant binary"), and reran it. It was looking for the file in the directory I was running from. I have no idea if it was looking in the hadoop base dir or the execution dir.
...so I made a symbolic link in the directory I run terasort in pointing to the real file in the input directory.
It's a cheap hack, but it works.
- Tim.