Hafiz Muhammad Shafiq

Reputation: 8670

Apache Nutch SolrIndexer error in SolrCloud mode

I have configured Apache Nutch 2.3.1 and crawled a few websites. I need to index these documents into Solr (6.6.3), which is running in cloud mode. When I execute the solrindex command, I get the following exception:

2018-05-02 13:10:40,679 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@3bd3d05e
java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://10.11.22.156:8983/solr/collection2
    at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:103)
    at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:114)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
    at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2019)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://10.11.22.156:8983/solr/collection2
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:559)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:97)
    ... 11 more
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://10.11.22.156:8983 refused
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:190)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:643)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:448)
    ... 17 more
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)

Where is the problem? If I repeat the same job against Solr without cloud mode, it works fine.

Upvotes: 0

Views: 157

Answers (1)

Dimanshu Parihar

Reputation: 377

The error plainly shows that your Apache Nutch server is unable to reach this particular Apache Solr node and port: http://10.11.22.156:8983/solr/collection2.

You need to open network access between these two servers so that they can communicate with each other:

  1. Allow the Solr server to exchange requests and responses with the Apache Nutch server (outbound permission on the Solr side).
  2. Allow the Apache Nutch server to access the given Solr IP and port (inbound permission on the Nutch side).
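
Before changing firewall rules, it can help to confirm that the port really is unreachable from the machine that runs the indexing job. The following is a minimal sketch of such a check, not part of Nutch or Solr; the function name `can_connect` and the 5-second timeout are assumptions made for illustration, and the host and port in the usage comment are the ones from the stack trace in the question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    `can_connect` is a hypothetical helper for this example only.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the `with` block closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (address taken from the stack trace in the question):
#   print(can_connect("10.11.22.156", 8983))
```

Since the stack trace shows the failure inside a Hadoop map task (`YarnChild`), the check is probably most informative when run on each Hadoop worker node rather than only on the machine where the job was submitted. A result of `False` suggests a firewall or routing problem between that node and the Solr node.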

Upvotes: 4
