Reputation: 5302
I am trying to collect Captcha images using the Python requests module and either save them to a file or load them into memory for further processing, but the attempt below does not work.
The code I have tried so far:
import requests
url = 'https://dpdc.org.bd/site/application/libs/captcha/simple-php-captcha.php?_CAPTCHA&t=0.84582400+1651208314'
r = requests.get(url)
with open('file.png', 'wb') as f:
    f.write(requests.get(url).content)
The site I am working with URL
N.B. I tried with requests.Session() too, but it was all in vain. I am trying to avoid heavyweight Selenium, though Selenium can do the job. I plan to save the Captcha image and solve it using Keras, but I am stuck at the Captcha image collection step.
Upvotes: 0
Views: 771
Reputation: 11829
The CAPTCHA is generated by a PHP script that requires a session.
You need to make two requests: the first to the form page to obtain the session cookie, and the second to the image URL, sending the cookie from the first request.
import requests
form_url = 'https://dpdc.org.bd/site/service/ebill_gov/'
form_response = requests.get(form_url)
# missing code to parse captcha url
# ...
captcha_url = 'https://dpdc.org.bd/site/application/libs/captcha/simple-php-captcha.php?_CAPTCHA&t=0.22003900+1651258878'
captcha_response = requests.get(captcha_url, cookies=form_response.cookies)
with open('file.png', 'wb') as f:
    f.write(captcha_response.content)
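For the parsing step left out above, here is a rough sketch of how both requests could share one requests.Session so the cookie is carried over automatically. It is not tested against the live site. The regular expression, the helper names find_captcha_url and download_captcha, and the 10-second timeout are my own assumptions; in particular, I am assuming the captcha URL appears in the form HTML as a quoted attribute value containing simple-php-captcha.php, so adjust the pattern to the real markup.

```python
import re
from html import unescape
from urllib.parse import urljoin

import requests

FORM_URL = 'https://dpdc.org.bd/site/service/ebill_gov/'

# Assumption: the captcha URL sits inside a quoted attribute value that
# contains "simple-php-captcha.php". Adjust this to the page's real markup.
CAPTCHA_PATTERN = re.compile(r"[\"']([^\"']*simple-php-captcha\.php[^\"']*)[\"']")


def find_captcha_url(page_html, base_url):
    """Return the absolute captcha URL found in page_html, or None."""
    match = CAPTCHA_PATTERN.search(page_html)
    if match is None:
        return None
    # Attribute values may escape "&" as "&amp;", so unescape first.
    return urljoin(base_url, unescape(match.group(1)))


def download_captcha(path='file.png'):
    """Fetch the form page, then the captcha image, within one session."""
    with requests.Session() as session:
        # Request 1: the form page, which sets the session cookie.
        form_response = session.get(FORM_URL, timeout=10)
        form_response.raise_for_status()
        captcha_url = find_captcha_url(form_response.text, FORM_URL)
        if captcha_url is None:
            raise ValueError('captcha URL not found in the form page')
        # Request 2: the image; the session resends the cookie for us.
        captcha_response = session.get(captcha_url, timeout=10)
        captcha_response.raise_for_status()
        with open(path, 'wb') as f:
            f.write(captcha_response.content)
    return captcha_url
```

Calling download_captcha() should write file.png if the assumptions hold. To keep the image in memory instead, use captcha_response.content directly (it is a bytes object) rather than writing it to disk.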
Upvotes: 1