Reputation: 3270
I am trying to configure Nutch to run a multi-threaded crawl.
However, I am facing an issue: I cannot get the crawl to run with multiple threads. I have modified nutch-site.xml to use 25 threads, but I still see only 1 thread running.
<property>
<name>fetcher.threads.fetch</name>
<value>25</value>
<description>The number of FetcherThreads the fetcher should use.
This also determines the maximum number of requests that are
made at once (each FetcherThread handles one connection).</description>
</property>
<property>
<name>fetcher.threads.per.host</name>
<value>25</value>
<description>This number is the maximum number of threads that
should be allowed to access a host at one time.</description>
</property>
The fetcher log always shows activeThreads=25, spinWaiting=24, fetchQueues.totalSize=some value.
What does this mean? Can you please explain what the issue is and how I can solve it?
I would greatly appreciate your help.
Thanks, Sumit
Upvotes: 1
Views: 2430
I think your issue is related to a known bug with the new Nutch fetcher. See NUTCH-721.
You can try using OldFetcher (if you have Nutch 1.0) to see if that solves your problem.
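Separately from that bug, it may help to rule out politeness settings. In the fetcher log, spinWaiting counts threads that are idle because no queue has a URL ready for them, so activeThreads=25 with spinWaiting=24 means only one thread is actually fetching. Below is a minimal sketch of the related settings, assuming the stock property names from the Nutch 1.0 nutch-default.xml. The values are illustrative assumptions, not tuned recommendations, and the fragment belongs inside the existing configuration element of nutch-site.xml.

```xml
<!-- Sketch for conf/nutch-site.xml; values are illustrative assumptions -->
<property>
  <!-- Total number of fetcher threads -->
  <name>fetcher.threads.fetch</name>
  <value>25</value>
</property>
<property>
  <!-- Maximum number of threads allowed to hit one host at the same time -->
  <name>fetcher.threads.per.host</name>
  <value>25</value>
</property>
<property>
  <!-- Minimum delay in seconds between requests to the same server;
       per nutch-default.xml it applies only when
       fetcher.threads.per.host is greater than 1 -->
  <name>fetcher.server.min.delay</name>
  <value>0.0</value>
</property>
```

If most of the queued URLs belong to a single host, these per-host limits, rather than fetcher.threads.fetch, are likely to bound how many threads can do useful work.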
-- Ken
Upvotes: 2