Reputation: 1467
I have a script that runs 24/7 and is sometimes killed by a system reboot. One part of the script collects bins from pastebin[.]com with certain contents, and the other part exports them to a remote REST endpoint. The collecting part sends a lot of requests and never runs into issues with HTTPConnectionPool, while the exporting part tends to run into them pretty quickly, even though it sends requests much less often.
I have the following code with retry logic to ensure the bin gets exported to the remote endpoint:
    import requests
    from time import sleep

    def send_export_request(self, payload):
        # Keep retrying until the POST completes without raising.
        while True:
            success = False
            try:
                self.session.post(self.collector, data=payload, timeout=10)
                success = True
            except requests.exceptions.RequestException as e:
                self.logger.log_error("RequestException occurred when storing paste %s: %s" % (payload['key'], e))
            if success:
                break
            self.logger.log("Retrying to store the paste...")
            # Drop the old session and start a fresh one before the next attempt.
            self.session.close()
            self.session = requests.session()
            sleep(2)
Of course, self.session is initialized in the constructor to requests.session(). What eventually always happens (the time it takes differs from case to case, but it is always under 24 hours) is that the following exception is raised:
HTTPConnectionPool(host='www.[redacted].com', port=80): Read timed out. (read timeout=10)
The code then goes into a loop: it raises this exception, logs it, waits 2 seconds, tries again, raises the exception again, and so on. It never recovers unless I kill the script and run it again. I searched a lot. I originally tried the code without a session (just plain post requests), then added the session, and finally tried creating a new session before retrying. None of that works. What am I missing?
Upvotes: 2
Views: 11789
Reputation: 1467
No wonder no one knew where the issue lies. I will answer this question to shed some light on what the problem was.
I did some further testing. The remote server to which I was posting the contents of bins had some sort of IPS or similar system enabled. The collector is deliberately not behind HTTPS, so payload inspection was possible, and when a payload contained certain keywords or known signatures, the remote server decided to let the connection time out.
Because not having the requests behind HTTPS is crucial for my use case (traffic sniffing and inspection must be possible for anyone), I figured out a workaround: if a request is killed by the remote server, I base64-encode its body before retrying, and then it works.
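For illustration, here is a minimal sketch of that fallback, not my exact code. The helper names `encode_body` and `export_with_fallback` are hypothetical, and it assumes that the payload is a dict sent as a form body and that the receiving endpoint knows how to base64-decode what it gets. `post` stands for any callable with the same signature as `requests.Session.post`.

```python
import base64
from urllib.parse import urlencode, parse_qs


def encode_body(payload):
    """Form-encode the payload dict, then base64-encode the resulting bytes."""
    return base64.b64encode(urlencode(payload).encode("utf-8"))


def export_with_fallback(post, url, payload, timeout=10):
    """Try a plain POST first; on a request error, retry once with a base64 body.

    `post` is any callable shaped like requests.Session.post.
    requests.exceptions.RequestException derives from IOError (OSError),
    so catching OSError also covers read timeouts raised by requests.
    """
    try:
        return post(url, data=payload, timeout=timeout)
    except OSError:
        # Assumption: the failure was caused by payload inspection, and the
        # receiving endpoint can base64-decode the body it is sent.
        return post(url, data=encode_body(payload), timeout=timeout)
```

In the class from the question, this could be called as `export_with_fallback(self.session.post, self.collector, payload)`. The sketch retries only once; wrapping it in the existing retry loop would keep the original "retry until stored" behaviour.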
Upvotes: 2