Milan Verescak

Reputation: 33

Is it possible to pause and resume crawling with the Java crawler crawler4j?

I already know that a crawl can be configured to be resumable.

But is it possible to use the resumable functionality to pause the crawling process and then resume it later programmatically? E.g. I could gracefully shut down the crawl with the controller's shutdown method while the resumable parameter is set to true, and then start crawling again.

Will it work this way, given that the primary purpose of the resumable parameter is to handle accidental crashes of the crawler? Is there any other or better way to achieve this with crawler4j?

Upvotes: 0

Views: 297

Answers (1)

rzo1

Reputation: 5751

If you set the parameter resumable to true, the Frontier as well as the DocIdServer will store their queues in the user-defined storage folder.

This works for a crash as well as for a programmatic shutdown. In both cases, the storage folder must be the same when the crawl is started again.
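To illustrate the idea only, here is a minimal, self-contained sketch of a frontier that writes its pending URLs to a storage folder on shutdown and reloads them on the next start. It is a toy model, not crawler4j's implementation: the class `ResumableFrontierSketch`, its methods, and the file name `frontier-queue.txt` are all hypothetical. In crawler4j itself, the corresponding switches are `CrawlConfig.setResumableCrawling(true)` and `CrawlConfig.setCrawlStorageFolder(...)`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;

/** Hypothetical toy frontier; it only models the pause/resume idea. */
public class ResumableFrontierSketch {
    private final Path queueFile;
    private final Deque<String> pending = new ArrayDeque<>();

    public ResumableFrontierSketch(Path storageFolder, boolean resumable) throws IOException {
        Files.createDirectories(storageFolder);
        // Assumed file name, chosen for this sketch only.
        this.queueFile = storageFolder.resolve("frontier-queue.txt");
        if (resumable && Files.exists(queueFile)) {
            // Resume: reload the URLs that were still pending at the last shutdown.
            pending.addAll(Files.readAllLines(queueFile, StandardCharsets.UTF_8));
        } else {
            // Not resumable: discard any state left over from an earlier run.
            Files.deleteIfExists(queueFile);
        }
    }

    /** Adds a URL to the end of the queue. */
    public void schedule(String url) {
        pending.addLast(url);
    }

    /** Returns the next URL to crawl, or null if the queue is empty. */
    public String next() {
        return pending.pollFirst();
    }

    public int size() {
        return pending.size();
    }

    /** Pause: persist the remaining queue so that a later run can pick it up. */
    public void shutdown() throws IOException {
        Files.write(queueFile, new ArrayList<>(pending), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path storage = Files.createTempDirectory("crawl-storage");

        // First run: schedule three URLs, process one, then pause.
        ResumableFrontierSketch first = new ResumableFrontierSketch(storage, true);
        first.schedule("https://example.com/a");
        first.schedule("https://example.com/b");
        first.schedule("https://example.com/c");
        first.next();
        first.shutdown();

        // Second run: same storage folder, so the remaining URLs come back.
        ResumableFrontierSketch second = new ResumableFrontierSketch(storage, true);
        System.out.println(second.size() + " URLs restored, next: " + second.next());
    }
}
```

The second instance only sees the remaining work because it is pointed at the same folder as the first one, which mirrors the requirement that the storage folder stay the same between runs.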

See also the related issue on the official issue tracker.

Upvotes: 2
