Hafiz Muhammad Shafiq

Reputation: 8670

Nutch error "Limit reached, skipping further inlinks for"

I am using Nutch 2.2.1. It worked well for a few days, but now it no longer crawls anything and prints messages like the following:

Limit reached, skipping further inlinks for de.ard.www:http/
Limit reached, skipping further inlinks for de.rbb-online.mediathek:http/

Limit reached, skipping further inlinks for de.rbb-online.www:http/

How can I get rid of it?

Upvotes: 0

Views: 197

Answers (1)

Talat

Reputation: 68

This is not an error. It means Nutch found more inlinks for that URL than the db.max.inlinks setting allows, so only the first N inlinks are stored and the rest are discarded. By default, db.max.inlinks is set to 10000.

IMHO, if you want to crawl more of the pages linked from each page, you should increase the db.max.outlinks.per.page setting. By default it is set to 100 per page.
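
As a rough sketch of what such an override could look like, the fragment below sets both properties in conf/nutch-site.xml, the file Nutch reads for local overrides of its defaults. The values 50000 and 1000 are only illustrative assumptions, not recommendations; pick numbers that suit your crawl.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Illustrative value (assumption): keep more inlinks per URL
       than the default of 10000 before the rest are discarded. -->
  <property>
    <name>db.max.inlinks</name>
    <value>50000</value>
  </property>

  <!-- Illustrative value (assumption): process more outlinks per page
       than the default of 100. -->
  <property>
    <name>db.max.outlinks.per.page</name>
    <value>1000</value>
  </property>
</configuration>
```

If your nutch-site.xml already has a configuration element, add only the two property blocks to it rather than a second root element.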

Upvotes: 1
