Amit Valse

Reputation: 95

Elasticsearch on Qbox is not accessible through Nutch

I have a Qbox instance for Elasticsearch (more details about Qbox Elasticsearch can be found at http://qbox.io/) with a custom TCP port. When I try to access the instance through Nutch for indexing, I get the following error:

2016-06-30 15:39:14,320 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2016-06-30 15:39:14,320 WARN  mapred.LocalJobRunner - job_local561208907_0001
java.lang.Exception: org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: []
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: []
    at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:278)
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:197)
    at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:106)
    at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:163)
    at org.elasticsearch.client.transport.TransportClient.bulk(TransportClient.java:364)
    at org.elasticsearch.action.bulk.BulkRequestBuilder.doExecute(BulkRequestBuilder.java:164)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
    at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
    at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.commit(ElasticIndexWriter.java:208)
    at org.apache.nutch.indexwriter.elastic.ElasticIndexWriter.close(ElasticIndexWriter.java:226)
    at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:114)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:767)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Upvotes: 1

Views: 258

Answers (1)

Pranav Shukla

Reputation: 2226

You usually get this error when either the cluster host/port or the cluster name does not match between the client and the server.

In your $NUTCH_ROOT/runtime/local/conf/nutch-site.xml, please make sure you have configured a host, port and cluster name that match your cluster on qbox.io:

<property>
  <name>elastic.host</name>
  <value></value>
  <description>The hostname to send documents to using TransportClient. Either host
  and port must be defined or cluster.</description>
</property>

<property>
  <name>elastic.port</name>
  <value>9300</value>
  <description>The port to connect to using TransportClient.</description>
</property>

<property>
  <name>elastic.cluster</name>
  <value></value>
  <description>The cluster name to discover. Either host and port must be defined
  or cluster.</description>
</property>
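
Filled in for a Qbox cluster, the properties might look like the sketch below. The hostname, port and cluster name here are placeholders, not real values — copy the actual ones from your Qbox dashboard, and note that elastic.port must be the TCP transport port (the one the Java TransportClient uses, 9300 by default but custom on Qbox), not the HTTP port:

<property>
  <name>elastic.host</name>
  <!-- placeholder: your Qbox endpoint hostname -->
  <value>your-cluster-id.qbox.io</value>
</property>

<property>
  <name>elastic.port</name>
  <!-- placeholder: the custom TCP transport port assigned by Qbox -->
  <value>12345</value>
</property>

<property>
  <name>elastic.cluster</name>
  <!-- placeholder: must match cluster.name on the server exactly -->
  <value>your-cluster-name</value>
</property>

A mismatched elastic.cluster value produces exactly the NoNodeAvailableException above, because the TransportClient checks the cluster name of each node it connects to and drops nodes whose name does not match.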

Upvotes: 0
