Reputation: 11
I wrote a program that uses requests to find available /id/ custom URLs on Steam, but it takes a very long time. If anybody knows a way to make the requests faster, please let me know.
import requests

w = open("not taken.txt", "a")
f = open("og_users.txt", "r")

def is_steam_customurl_taken(id):
    r = requests.get("https://steamcommunity.com/id/%s" % id)
    if ("The specified profile could not be found.".lower() in r.text.lower()):
        return False
    return True

lines = f.readlines()
for line in lines:
    username = line.strip()
    if is_steam_customurl_taken(username):
        print("%s is taken" % username)
    if not is_steam_customurl_taken(username):
        w.write(username)
        w.write("\n")
        print("%s is not taken" % username)
w.close()
f.close()
Upvotes: 1
Views: 138
Reputation: 2027
If you have Steam IDs, see about obtaining a Steam Web API key and use a proper API (some sites have measures to detect and block web-scrapers). Their API has a players endpoint which allows you to submit 100 IDs per request.
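As a rough sketch of the batching idea, the helper below splits a list of Steam IDs into groups of at most 100 and builds one set of query parameters per group. The endpoint URL and the `key`/`steamids` parameter names are my reading of the public Steam Web API documentation, and `batched` and `build_batch_params` are hypothetical helper names, so check them against the current docs before relying on this.

```python
from itertools import islice

# Assumed endpoint, taken from the public Steam Web API docs; verify before use.
PLAYER_SUMMARIES_URL = "https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/"
MAX_IDS_PER_REQUEST = 100  # limit mentioned in the answer above


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_batch_params(steam_ids, api_key):
    """Return one query-parameter dict per batch of up to 100 Steam IDs."""
    return [
        {"key": api_key, "steamids": ",".join(str(i) for i in batch)}
        for batch in batched(steam_ids, MAX_IDS_PER_REQUEST)
    ]
```

Each dict could then be passed as `params=` to a single `requests.get(PLAYER_SUMMARIES_URL, ...)` call, so 250 IDs would need three requests instead of 250.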
If you only have the names, though, try the xml=1 query parameter (e.g. https://steamcommunity.com/id/eroticgaben?xml=1) for a much lighter response.
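A minimal sketch of how the question's check might look with the xml=1 parameter. It assumes the XML response for a missing profile still contains the same "could not be found" message as the HTML page, which I have not verified for every case; `profile_missing` is a hypothetical helper name and the 10-second timeout is an arbitrary choice.

```python
import requests

NOT_FOUND = "the specified profile could not be found."


def profile_missing(body):
    """Return True if a response body contains Steam's 'not found' message."""
    return NOT_FOUND in body.lower()


def is_steam_customurl_taken(custom_id):
    # params={"xml": 1} makes requests append ?xml=1 to the URL.
    r = requests.get(
        "https://steamcommunity.com/id/%s" % custom_id,
        params={"xml": 1},
        timeout=10,  # assumption: fail after 10 s instead of hanging forever
    )
    return not profile_missing(r.text)
```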
Upvotes: 1
Reputation: 474141
Your bottlenecks here are, basically, two things: the network itself and the blocking, synchronous nature of the code.
There are a couple of easy wins you can get to improve your current "synchronous" approach:
Instantiate a requests.Session() and reuse it for your network requests. This should speed things up significantly, as you are making requests to the same host:
"if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase"
Do not call is_steam_customurl_taken() twice per row. Call it once and store the result in a variable:
is_taken = is_steam_customurl_taken(username)
if is_taken:
    print("%s is taken" % username)
else:
    w.write(username + "\n")
    print("%s is not taken" % username)
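Putting both suggestions together, a sketch of the whole loop might look like the following. The extra `session` argument, the `check_usernames` wrapper, and the 10-second timeout are my own additions for illustration, not part of the original code.

```python
import requests

NOT_FOUND = "the specified profile could not be found."


def is_steam_customurl_taken(session, custom_id):
    """Check one custom URL, reusing the session's pooled connection."""
    r = session.get("https://steamcommunity.com/id/%s" % custom_id, timeout=10)
    return NOT_FOUND not in r.text.lower()


def check_usernames(in_path, out_path):
    """Read usernames from in_path and append the free ones to out_path."""
    # One Session for the whole run, so the TCP connection is reused.
    with requests.Session() as session, \
            open(in_path) as f, open(out_path, "a") as w:
        for line in f:
            username = line.strip()
            if not username:
                continue  # skip blank lines
            # Exactly one request per username.
            is_taken = is_steam_customurl_taken(session, username)
            if is_taken:
                print("%s is taken" % username)
            else:
                w.write(username + "\n")
                print("%s is not taken" % username)
```

Calling `check_usernames("og_users.txt", "not taken.txt")` would reproduce the original script's behaviour with half the requests and a single reused connection.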
As far as making things asynchronous and non-blocking, you can look into packages like grequests or Scrapy, which would allow you to not wait on the network and process more usernames at a time.
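To illustrate the general idea without committing to either package, here is a sketch that uses the standard library's ThreadPoolExecutor instead of grequests or Scrapy. It is a different technique (threads rather than an event loop), but it gets the same effect of not waiting on one response before sending the next request. `check_many` and `max_workers=8` are hypothetical choices of mine.

```python
from concurrent.futures import ThreadPoolExecutor


def check_many(usernames, is_taken, max_workers=8):
    """Run is_taken(username) concurrently and return {username: result}."""
    usernames = list(usernames)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with usernames.
        results = list(pool.map(is_taken, usernames))
    return dict(zip(usernames, results))
```

Here `is_taken` would be a one-argument function such as the question's `is_steam_customurl_taken`. It is probably wise to keep `max_workers` modest, since many parallel requests to the same site may trigger rate limiting or blocking.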
Upvotes: 3