akshay202

Reputation: 606

Setting cookie header in Apache Nutch

I want to crawl a specific site that uses cookies for authentication. I want to set a cookie and a User-Agent header in every GET request that Apache Nutch makes while crawling the site.

How do I specify the cookie information in the config, or do I need to write a custom plugin for this?

Upvotes: 1

Views: 531

Answers (1)

Jorge Luis

Reputation: 3253

At the moment there is no way to manually specify a cookie or header for Nutch to send when fetching URLs. The protocol-httpclient plugin has some support for form-based authentication; take a look at the httpclient-auth.xml file. I don't think this would be too hard to implement, and we always welcome contributions.
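
For anyone attempting such a contribution, the core of it is attaching two headers to each GET. Here is a minimal sketch in plain Java (JDK 11+ `java.net.http`), not Nutch's plugin API. The class name, method name, URL, cookie value and agent string are all made up for illustration, and a real plugin would read them from the Nutch configuration instead:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Nutch: shows only the header handling
// that a custom protocol plugin would need to perform for each fetch.
final class CookieRequestSketch {

    // Builds (but does not send) a GET request carrying a fixed Cookie
    // header and a custom User-Agent header.
    static HttpRequest build(String url, String cookie, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Cookie", cookie)          // e.g. a session cookie copied from a logged-in browser
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values, assumed for the example.
        HttpRequest request = build(
                "http://example.com/protected/page",
                "JSESSIONID=abc123",
                "my-crawler/1.0");

        // Print the headers that would be sent with the request.
        request.headers().map().forEach((name, values) ->
                System.out.println(name + ": " + String.join(", ", values)));
    }
}
```

Sending the request would go through `java.net.http.HttpClient`; it is left out here so the sketch runs without network access.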

Upvotes: 1
