Reputation:
I'm trying to perform about 100k GET requests and parse the response body of each request. I thought grequests would be a good way to go, but I'm getting errors related to 'too many open files'. Here's the code:
import grequests
with open("./100k-sites.csv", "r") as f:
    urls = ["http://" + line.rstrip() for line in f]
rs = (grequests.get(u, timeout=1) for u in urls)
responses = grequests.map(rs)
for r in responses:
    try:
        body = r.text  # do something with the response body
    except Exception:
        pass
Does anyone have experience with this? The error I'm getting is:
<requests.packages.urllib3.connection.HTTPConnection object at 0x7f817ab36898>: Failed to establish a new connection [Errno 24] Too many open files
Upvotes: 2
Views: 1365
Reputation: 9
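Use imap instead of map, and cap the pool size:
    for resp in grequests.imap(rs, size=20):
        pass
With at most 20 requests in flight at a time, you will not run into such problems with open files and memory. Keep in mind that imap returns a generator, so responses are yielded as they complete rather than in the order of the input URLs.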
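If you want to see the same idea without grequests, here is a rough standard-library sketch of capping concurrency. It swaps in concurrent.futures and urllib instead of grequests, and the names fetch_body, fetch_all and max_workers are made up for illustration. The point is only that a fixed-size pool keeps the number of simultaneously open sockets bounded.

```python
from concurrent.futures import ThreadPoolExecutor
from http.client import HTTPException
from urllib.request import urlopen


def fetch_body(url, timeout=1):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, ValueError, HTTPException):
        # Connection errors, timeouts, malformed URLs, broken responses.
        return None


def fetch_all(urls, fetch=fetch_body, max_workers=20):
    """Yield (url, body) pairs, running at most max_workers fetches at once."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; only max_workers
        # calls to fetch (and so sockets) are active at any moment.
        yield from zip(urls, pool.map(fetch, urls))
```

Usage would look something like `for url, body in fetch_all(urls): ...`, skipping entries where body is None. Unlike imap, this version keeps the input order.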
Upvotes: -1
Reputation: 797
This may be only a workaround (as somebody in the discussion mentioned above says), but IMHO it's worth noting here that you can fix it with these two lines:
import resource
resource.setrlimit(resource.RLIMIT_NOFILE, (110000, 110000))
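One caveat: setrlimit raises ValueError if the requested soft limit exceeds the hard limit, or if an unprivileged process tries to raise its hard limit. A slightly more defensive sketch, assuming a Unix system (the helper name raise_nofile_limit and its target parameter are my own, not part of the resource module), raises only the soft limit and leaves the hard limit alone:

```python
import resource


def raise_nofile_limit(target=110000):
    """Raise the soft open-file limit toward target; return the limits now in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # Already unlimited, nothing to do.
        return soft, hard
    # Never ask for more than the hard limit allows.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        except (ValueError, OSError):
            # Some platforms reject the value anyway; keep the current limits.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = raise_nofile_limit()
print("open-file limits (soft, hard):", soft, hard)
```

If the printed soft limit is still well below the number of concurrent requests, capping the concurrency is probably the more reliable fix.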
Upvotes: 1