Hadoop word count example fails with 'not a SequenceFile'. How do I set the file format?

Question

I'm trying to run hadoop jar /usr/lib/hadoop/hadoop-examples.jar aggregatewordcount /data/gutenberg/huckfinn.txt output/guten4 but get an error "huckfinn.txt not a SequenceFile".

From other sites and from the source of this example, I can see there is a textinputformat argument that I'm guessing fixes this, but I can't figure out how to specify it.

If I run hadoop jar /usr/lib/hadoop/hadoop-examples.jar aggregatewordcount /data/gutenberg/huckfinn.txt output/guten5 2 textinputformat, I get a different error: "java.lang.RuntimeException: Error in configuring object".

Josh Rosen · Accepted Answer

According to the mailing list post linked from your question, the java.lang.RuntimeException: Error in configuring object exception is caused by the example's dependencies not being on the tasktracker's classpath. The full stack trace shows this. When I run your second command on my machine, I get:

java.lang.RuntimeException: Error in configuring object
    [...]
Caused by: java.lang.reflect.InvocationTargetException
    [...]
Caused by: java.lang.RuntimeException: Error in configuring object
    [...]
Caused by: java.lang.reflect.InvocationTargetException
    [...]
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass
    [...]
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.examples.AggregateWordCount$WordCountPlugInClass
    [...]

This post on the Cloudera blog discusses the different methods of providing dependencies to the tasktrackers.

To run the aggregatewordcount example, I used the -libjars option:

hadoop jar hadoop-examples.jar aggregatewordcount -libjars hadoop-examples.jar /data/gutenberg/huckfinn.txt output/guten7 2 textinputformat
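
The trailing "2 textinputformat" in that command are positional arguments: as far as I can tell from the example's source, the third argument is the number of reducers and the fourth selects the input format, with anything other than textinputformat falling back to SequenceFile input. That would explain why the first command in the question fails with "not a SequenceFile". The sketch below only mimics that argument handling so the behavior is easy to see. It is not Hadoop code: the class name AggregateArgsSketch and both helper methods are hypothetical, and the exact logic in your Hadoop version may differ.

```java
public class AggregateArgsSketch {

    // Hypothetical helper (not a Hadoop API): picks the input format name
    // from the positional arguments, assuming the 4th argument is the selector.
    static String inputFormatFor(String[] args) {
        if (args.length > 3 && args[3].equalsIgnoreCase("textinputformat")) {
            return "TextInputFormat";
        }
        // Assumed fallback: with no 4th argument, input is read as a SequenceFile.
        return "SequenceFileInputFormat";
    }

    // Hypothetical helper: the 3rd argument is assumed to be the reducer count,
    // defaulting to 1 when it is absent.
    static int reducersFor(String[] args) {
        return args.length > 2 ? Integer.parseInt(args[2]) : 1;
    }

    public static void main(String[] cli) {
        // Arguments from the first command in the question.
        String[] first = {"/data/gutenberg/huckfinn.txt", "output/guten4"};
        // Arguments from the second command in the question.
        String[] second = {"/data/gutenberg/huckfinn.txt", "output/guten5", "2", "textinputformat"};

        System.out.println(inputFormatFor(first));
        System.out.println(inputFormatFor(second) + " with " + reducersFor(second) + " reducers");
    }
}
```

Under these assumptions, a plain text file such as huckfinn.txt needs both the reducer count and textinputformat on the command line, because the format selector cannot be given without the argument that precedes it.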
