Reputation: 1519
When I type the following command into Cygwin:
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
then the binary works fine. When I place the exact same line into my bash script:
#!/bin/bash/
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
I get an error saying some files don't exist. This may be specific to Nutch, which is the program I'm running, but I think it has more to do with how I'm calling the command in the script. Any ideas about what's wrong and how to fix it? (Yes, I'm using tab completion.)
EDIT:
Script:
#!/bin/bash
/home/Dan/apache-nutch-1.2/bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
I run the command:
$ pwd
/home/Dan/apache-nutch-1.2
$ ./nutch.sh
The output I'm getting is:
Indexer: starting at 2010-11-29 15:15:44
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_fetch
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_parse
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_data
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_text
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
at org.apache.nutch.indexer.Indexer.run(Indexer.java:97)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.Indexer.main(Indexer.java:106)
Regards, ~DS
Upvotes: 0
Views: 1070
Reputation: 31296
Two things:
1. The shebang line should be #!/bin/bash, with no trailing slash. Also double-check that there is a bash in /bin.
2. bin/nutch is a relative path, so the script only works when there is a bin directory in your current folder. So if you're in $HOME, and assuming you've got a path $HOME/bin/nutch, then you'll be okay. But if you change to /tmp, it'll fail, as there's no such path as /tmp/bin/nutch. You're better off giving the full absolute path name to nutch in the first place.
Upvotes: 1
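The advice about absolute paths can be sketched roughly as follows. This is only an illustration, assuming the script is saved in the Nutch install directory (as nutch.sh is in the question); the names NUTCH_HOME, NUTCH_BIN, and run_index are hypothetical and chosen for this example, not something Nutch itself requires. The index arguments are copied from the question's edited script.

```shell
#!/bin/bash
# Hypothetical sketch: NUTCH_HOME, NUTCH_BIN and run_index are illustrative names.
# Resolve the directory that contains this script, so the script behaves the
# same no matter which directory it is started from. An existing NUTCH_HOME
# in the environment takes precedence.
NUTCH_HOME="${NUTCH_HOME:-$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)}"
NUTCH_BIN="$NUTCH_HOME/bin/nutch"

run_index() {
    # Fail early with a readable message if the launcher is not where we expect.
    if [ ! -x "$NUTCH_BIN" ]; then
        echo "nutch not found at $NUTCH_BIN" >&2
        return 1
    fi
    # Every argument is an absolute path; the glob is left unquoted so that
    # the shell expands it to the individual segment directories.
    "$NUTCH_BIN" index \
        "$NUTCH_HOME/crawl/indexes" \
        "$NUTCH_HOME/crawl/crawldb" \
        "$NUTCH_HOME/crawl/linkdb" \
        "$NUTCH_HOME"/crawl/segments/*
}

run_index || echo "indexing did not run" >&2
```

With this shape, running the script from /tmp or from $HOME should resolve the same files, because nothing depends on the caller's working directory.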