YourHomicidalApe

Reputation: 51

Has Captcha completely invalidated my Selenium script?

I wrote a web scraper with Python/Selenium that automatically reserves a spot at my gym for me every morning. (You have to reserve at 7am and the spots fill up quickly, so I automated it to run at 7 every day.) It had been working well for a while, but a couple of days ago it stopped working. I got up early to check what was going on, only to find that the gym has added Captcha to its reservation process.

Picture of the Captcha service

Does this mean that someone working on the website added a Captcha to it? Or is it Google-added? Regardless, am I screwed? Is there any way for my bot to get around Captcha?

I found that when I run the Selenium script, the Captcha requires additional steps (e.g. finding all the crosswalks), whereas when I reserve manually the Captcha is still there but only requires me to click on it before moving on. Is this something I can take advantage of?

Thank you in advance for any help.

Upvotes: 0

Views: 365

Answers (2)

alinajafi

Reputation: 754

Try using https://github.com/dessant/buster to solve the Captcha.

implementation in python selenium -> repository

Upvotes: 1

James Tollefson

Reputation: 893

I've run into similar problems before. Sometimes you're just stuck and can't get past it. That's exactly what Captcha is meant to accomplish, after all.

However, I've found that sometimes a site will only present you with Captcha if it suspects, based on your behavior, that you are a bot. This can be partially overcome, especially if you're only making occasional calls to a site, by making your bot somewhat less predictable. I do this using np.random. I use a Poisson distribution to simulate user actions within the context of an individual session, since user actions can often be modeled as a Poisson process (in which case the gaps between actions are exponentially distributed). And I randomize the time I log into a site by choosing a random time within a certain range. These simple actions can be quite effective, although eventually most sites will figure out what you're doing.
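As a rough illustration of the two randomization ideas above, here is a minimal sketch. It is not the exact code I use: it relies on the standard library's random module instead of np.random so it runs without extra dependencies, it draws exponential gaps (the waiting time between events of a Poisson process), and the function names, the 2-second mean, the 0.3-second floor, and the 5-minute window are all made-up assumptions.

```python
import random
from datetime import datetime, timedelta


def action_delays(n_actions, mean_seconds=2.0, min_seconds=0.3, rng=None):
    """Return n_actions pause lengths (in seconds) between simulated user actions.

    Each gap is drawn from an exponential distribution with the given mean,
    then floored at min_seconds so no pause is unrealistically short.
    (Hypothetical helper; the defaults are illustrative assumptions.)
    """
    if rng is None:
        rng = random.Random()
    return [max(min_seconds, rng.expovariate(1.0 / mean_seconds))
            for _ in range(n_actions)]


def random_start_time(window_start, window_minutes, rng=None):
    """Pick a uniformly random start time within window_minutes of window_start.

    (Hypothetical helper; the window length is up to the caller.)
    """
    if rng is None:
        rng = random.Random()
    offset_seconds = rng.uniform(0, window_minutes * 60)
    return window_start + timedelta(seconds=offset_seconds)


if __name__ == "__main__":
    # Arbitrary example date; only the time-of-day window matters here.
    start = random_start_time(datetime(2021, 1, 1, 7, 0), window_minutes=5)
    print("Would start the session at", start.time())
    for pause in action_delays(4):
        # In a real script you would call time.sleep(pause) between actions.
        print(f"Would wait {pause:.2f}s before the next action")
```

Passing a seeded random.Random instance as rng makes the delays reproducible for testing; leave it out in normal use so each run differs.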

Before you implement either of these solutions, however, I strongly recommend you read the site's Terms of Use and consider whether overcoming their Captcha is a violation. If you signed a use agreement with them, the right thing to do is to honor it, even if it's somewhat inconvenient. I'd argue this separate ethical decision is of much greater importance than the technical challenge of trying to bypass their Captcha.

Upvotes: 2
