neildf

Reputation: 95

Have you indexed Nutch crawl results using Elasticsearch before?

Has anyone had any luck writing a custom indexer for Nutch that indexes the crawl results with Elasticsearch? Or do you know of any that already exist?

Upvotes: 8

Views: 3039

Answers (4)

Duong Nguyen

Reputation: 850

Time has passed, and Nutch now integrates well with ElasticSearch. Here is a nice tutorial.

Upvotes: 0

Matt Weber

Reputation: 342

I wrote an ElasticSearch plugin that mocks the Solr API. Using this plugin and the standard Nutch Solr indexer, you can easily send crawled data into ElasticSearch. The plugin and an example of how to use it with Nutch can be found on GitHub:

https://github.com/mattweber/elasticsearch-mocksolrplugin

Upvotes: 10

ctjmorgan

Reputation: 31

I know that Nutch will be adding pluggable backends, and I'm glad to see it. I needed to integrate Elasticsearch with Nutch 1.3, so I piggybacked off the Solr indexer code (src/java/org/apache/nutch/indexer/solr). The code is posted here:

https://github.com/ctjmorgan/nutch-elasticsearch-indexer

Upvotes: 3

Julien Nioche

Reputation: 4864

I haven't done it, but it is definitely doable. It would require piggybacking on the Solr code (src/java/org/apache/nutch/indexer/solr) and adapting it to ElasticSearch. It would be a nice contribution to Nutch, by the way.
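To give a rough idea of what adapting the indexing step involves, here is a minimal sketch (not the Nutch code itself) that serializes crawled documents into the newline-delimited JSON body expected by Elasticsearch's `_bulk` endpoint. The function name `to_bulk_body`, the index name `nutch`, and the `url`/`title`/`content` fields are assumptions chosen for illustration; a real indexer would take its fields from the Nutch document.

```python
import json


def to_bulk_body(docs, index_name="nutch"):
    """Build an Elasticsearch _bulk request body (NDJSON) from crawled docs.

    Hypothetical helper: `docs` is a list of dicts that each contain a
    "url" key, which is used here as the document ID.
    """
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["url"]}}))
        # Source line: the fields of the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Illustrative document; the field names are assumptions.
docs = [
    {"url": "http://example.com/", "title": "Example", "content": "Example page text"},
]
body = to_bulk_body(docs)
print(body)
```

The resulting string would be sent as the body of a POST request to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header.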

Upvotes: 2
