Scrapy: Is it possible to scrape a site protected by a captcha?

I want to scrape this site, but it has captcha protection.

Is there some way to tick the "I'm not a robot" checkbox with Python Scrapy?

Upvotes: 0

Views: 1873

Answers (1)

user12320641

A captcha like this usually appears when you send requests to a site too frequently. Scrapy is not a browser automation tool: it only requests a page and parses the HTML. If you want to fill in the captcha programmatically, you can use Selenium, but it is heavy and uses a lot of RAM.

A lighter solution is to use proxy or user-agent rotation. For example:

user_agents = ['mozilla 1/0', 'googlebot']

Then choose a random user agent:

import random
random_agent = random.choice(user_agents)

Then send the chosen user agent in the User-Agent header when requesting a page.
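In Scrapy this is typically done in a downloader middleware, so every outgoing request gets a fresh header. The following is only a minimal sketch: the class name `RandomUserAgentMiddleware` and the `USER_AGENTS` list are made up for illustration, and the strings are example values you would replace with current ones.

```python
import random

# Illustrative pool of user-agent strings (assumption: replace with real, current values).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


class RandomUserAgentMiddleware:
    """Hypothetical downloader middleware: picks a random User-Agent per request."""

    def __init__(self, user_agents=None):
        # Fall back to the module-level pool when no list is passed in.
        self.user_agents = list(user_agents or USER_AGENTS)

    def process_request(self, request, spider):
        # Overwrite the header on the outgoing request.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        # Returning None tells Scrapy to keep processing the request normally.
        return None
```

To try it, you would register the class in `settings.py` under `DOWNLOADER_MIDDLEWARES`, for example `{"myproject.middlewares.RandomUserAgentMiddleware": 400}`, where `myproject.middlewares` is a placeholder for wherever you put the class.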

Scrapy also provides middlewares for this purpose: https://doc.scrapy.org/en/1.4/topics/spider-middleware.html

List of user agents: https://deviceatlas.com/blog/list-of-user-agent-strings
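Proxy rotation can be sketched the same way. Scrapy's built-in HttpProxyMiddleware honours the `proxy` key in `request.meta`, so a small downloader middleware only has to set that key. The class name `RandomProxyMiddleware` and the proxy URLs below are placeholders, not real endpoints.

```python
import random

# Placeholder proxy endpoints (assumption: substitute proxies you are allowed to use).
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
]


class RandomProxyMiddleware:
    """Hypothetical downloader middleware: routes each request through a random proxy."""

    def __init__(self, proxies=None):
        # Fall back to the module-level pool when no list is passed in.
        self.proxies = list(proxies or PROXIES)

    def process_request(self, request, spider):
        # Scrapy reads the proxy to use from request.meta["proxy"].
        request.meta["proxy"] = random.choice(self.proxies)
        # Returning None lets the request continue through the remaining middlewares.
        return None
```

As with any downloader middleware, it would be enabled through `DOWNLOADER_MIDDLEWARES` in `settings.py`; giving it a priority lower than the built-in HttpProxyMiddleware (750 by default) should make it run first.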

Web crawlers use such techniques. Cheers!

Upvotes: 2
