Reputation: 33
I already know that you can configure crawling to be resumable.
But is it possible to use the resumable functionality to pause the crawling process and then resume it later programmatically? E.g. I could gracefully shut down crawling with the crawler's shutdown method, with the resumable parameter set to true, and then start crawling again.
Will it work this way, given that the primary purpose of the resumable parameter is to handle accidental crashes of the crawler? Is there any other or better way to achieve this functionality with crawler4j?
Upvotes: 0
Views: 297
Reputation: 5751
If you set the resumable parameter to true, the Frontier as well as the DocIdServer will store their queues in the user-defined storage folder.
This works for both a crash and a programmatic shutdown. In both cases, the resumed run must use the same storage folder as the original one.
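A minimal sketch of the pause-and-resume flow described above, assuming crawler4j 4.x (where shouldVisit takes a Page and a WebURL). The storage path, seed URL, crawler count, 30-second pause point and the MyCrawler class are hypothetical values picked for illustration; the crawler4j calls used are setCrawlStorageFolder, setResumableCrawling, startNonBlocking, shutdown and waitUntilFinish.

```java
import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;
import edu.uci.ics.crawler4j.url.WebURL;

public class PauseResumeCrawl {

    // Hypothetical values, not taken from the question or the answer.
    private static final String STORAGE_FOLDER = "/tmp/crawler4j-storage";
    private static final String SEED = "https://example.com/";
    private static final int NUM_CRAWLERS = 2;

    // Hypothetical crawler: stays on the seed site and prints each visited URL.
    public static class MyCrawler extends WebCrawler {
        @Override
        public boolean shouldVisit(Page referringPage, WebURL url) {
            return url.getURL().startsWith(SEED);
        }

        @Override
        public void visit(Page page) {
            System.out.println("Visited: " + page.getWebURL().getURL());
        }
    }

    // Builds a controller that always points at the same storage folder,
    // with resumable crawling switched on.
    private static CrawlController newController() throws Exception {
        CrawlConfig config = new CrawlConfig();
        config.setCrawlStorageFolder(STORAGE_FOLDER);
        config.setResumableCrawling(true);

        PageFetcher fetcher = new PageFetcher(config);
        RobotstxtServer robots = new RobotstxtServer(new RobotstxtConfig(), fetcher);
        return new CrawlController(config, fetcher, robots);
    }

    public static void main(String[] args) throws Exception {
        // First run: start without blocking so this thread can pause the crawl.
        CrawlController first = newController();
        first.addSeed(SEED);
        first.startNonBlocking(MyCrawler.class, NUM_CRAWLERS);

        Thread.sleep(30_000);    // arbitrary pause point for this sketch
        first.shutdown();        // request a graceful stop
        first.waitUntilFinish(); // block until the crawler threads have ended

        // Second run: a fresh controller on the same folder. No seed is added
        // here; the assumption is that the persisted Frontier supplies the
        // URLs that were still pending when the first run was stopped.
        CrawlController second = newController();
        second.start(MyCrawler.class, NUM_CRAWLERS);
    }
}
```

In a real application the second run would normally happen in a later process start rather than in the same main method; the only requirement taken from the answer is that both runs share the storage folder and have resumable set to true.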
See also the related issue on the official issue tracker.
Upvotes: 2