Reputation: 61
I'm trying to use Privoxy and Tor to rotate my IP address in order to scrape a site without getting banned by IP.
So I installed Tor with sudo apt install tor
and then modified the /etc/tor/torrc
file enabling these lines:
SocksPort 9050
ControlPort 9051
HashedControlPassword 16:A...
CookieAuthentication 1
I did the same for Privoxy with sudo apt install privoxy
and then sudo vim /etc/privoxy/config
where I added forward-socks5 / 127.0.0.1:9050 . (the trailing dot is part of the directive).
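Before involving the scraper, it may help to confirm that all three local services are actually listening. The following is only a minimal sketch: the ports 9050 and 9051 come from the torrc lines above, 8118 is the Privoxy port used later in the proxies dict, and the names SERVICES, port_is_open and check_services are made up for illustration.

```python
import socket

# Ports assumed from the configuration in this post:
# Tor SOCKS (9050), Tor control (9051), Privoxy listener (8118).
SERVICES = {"tor-socks": 9050, "tor-control": 9051, "privoxy": 8118}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, etc.
        return False


def check_services(host="127.0.0.1", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'listening' if ok else 'NOT reachable'}")
```

If any of the three shows as not reachable, the problem is likely in the service setup rather than in the scraper itself.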
Then, following this article, I created a rotate.py file which rotates my IP address every 10 minutes. It looks like this:
import time

from stem import Signal
from stem.control import Controller


def main():
    while True:
        # Wait 10 minutes between rotations
        time.sleep(60 * 10)
        print("Rotating IP")
        # Connect to Tor's ControlPort and ask for a new circuit
        with Controller.from_port(port=9051) as controller:
            controller.authenticate(password='mylovelypassword')
            controller.signal(Signal.NEWNYM)


if __name__ == '__main__':
    main()
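As a side note on the rotation itself: Tor rate-limits NEWNYM, and stem's Controller exposes is_newnym_available() and get_newnym_wait() for checking this. With a 10-minute interval that should rarely matter, but here is a rough sketch of a single rotation step that respects the limit. The helper name rotate_once and its parameters are hypothetical, and the controller is passed in so the logic can be tried without a running Tor.

```python
import time


def rotate_once(controller, newnym_signal, sleep=time.sleep):
    """Ask Tor for a new circuit, waiting first if NEWNYM is rate-limited.

    `controller` is expected to behave like stem.control.Controller;
    `newnym_signal` would be stem.Signal.NEWNYM in real use.
    Returns the number of seconds waited before sending the signal.
    """
    waited = 0.0
    if not controller.is_newnym_available():
        waited = controller.get_newnym_wait()
        sleep(waited)
    controller.signal(newnym_signal)
    return waited


if __name__ == "__main__":
    # Imported here so the helper above stays usable without stem installed.
    from stem import Signal
    from stem.control import Controller

    with Controller.from_port(port=9051) as controller:
        controller.authenticate(password="mylovelypassword")
        print("Waited", rotate_once(controller, Signal.NEWNYM), "seconds")
```

Running this once by hand also shows quickly whether authentication against the ControlPort works at all.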
On the other hand, I'm performing a Python POST request to get the data that I need, and it looks like this:
import requests

final_cookie = get_cookies()
url_base = 'http://...'
url_string = '...'
headers = {
    ...
    "User-Agent": """Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36"""
}
# Route plain-HTTP traffic through Privoxy, which forwards it to Tor
proxies = {"http": "127.0.0.1:8118"}
data = requests.post(url_base, headers=headers, data=url_string, verify=False, proxies=proxies)
So I first run the rotate.py script and then I run my scraper that will be performing those POST requests.
The issue is that I'm getting status code: 503
every time. If I just do a normal request like data = requests.post(url_base, headers=headers, data=url_string, verify=False)
it will actually get the data (200 response), but it will get blocked after a certain number of requests.
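One thing that might narrow this down is checking whether the 503 comes from Privoxy itself or from the target site. The sketch below is a guess-based heuristic, not a definitive test: it assumes Privoxy's own error pages use reason phrases like "Forwarding failure" or "Connect failed" and mention Privoxy in the body. The function name describe_503 and the set PRIVOXY_REASONS are invented for this example.

```python
# Reason phrases ASSUMED to appear on Privoxy-generated 503 pages.
PRIVOXY_REASONS = {"forwarding failure", "connect failed"}


def describe_503(status_code, reason, body=""):
    """Guess where a 503 came from, based on reason phrase and body text."""
    if status_code != 503:
        return "not a 503"
    reason_text = (reason or "").strip().lower()
    if reason_text in PRIVOXY_REASONS or "privoxy" in body.lower():
        return "probably generated by Privoxy (it could not reach Tor or the target)"
    return "probably returned by the target site (it may be blocking Tor exit nodes)"
```

With the request from this post, that would be called as describe_503(data.status_code, data.reason, data.text). A Privoxy-side result would point at the forwarding setup, while a site-side result would suggest the site is rejecting requests that arrive through Tor.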
So what could be causing my Privoxy-Tor setup to return only 503 error responses? Is it something in the services' configuration? Any advice/hint would be very much appreciated :) Cheers!!
Upvotes: 1
Views: 465