Reputation: 2833
I am having trouble wrapping my head around Python 3's asyncio library. I have a list of zipcodes and I am trying to make async calls to an API to get each zipcode's corresponding city and state. I can do it successfully in sequence with a for loop, but I want to make it faster for a big zipcode list.
This is an example of my original code, which works:
import urllib.request, json

zips = ['90210', '60647']

def get_cities(zipcodes):
    zip_cities = dict()
    for idx, zipcode in enumerate(zipcodes):
        url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
        response = urllib.request.urlopen(url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})
    return zip_cities

results = get_cities(zips)
print(results)
# returns {0: ['90210', 'Beverly Hills', 'California'],
#          1: ['60647', 'Chicago', 'Illinois']}
This is my terrible, non-functional attempt at making it async:
import asyncio
import urllib.request, json

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcodes):
    url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
    response = urllib.request.urlopen(url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
loop.run_until_complete([get_cities(zip) for zip in zips])
loop.close()
print(zip_cities) # doesn't work
Any help is much appreciated. All of the tutorials I've come across online seem to be a tad over my head.
Note: I've seen some examples use aiohttp. I was hoping to stick with the native Python 3 libraries if possible.
Upvotes: 16
Views: 13536
Reputation: 180550
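I haven't done much with asyncio, but asyncio.get_event_loop() should be what you need. You also have to change what your function takes as arguments (one zipcode and its index instead of the whole list), run the blocking urlopen call through loop.run_in_executor, and use asyncio.wait(tasks), as per the docs: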
import asyncio
import json
import urllib.request

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
    # Run the blocking urlopen call in the default thread pool executor
    fut = loop.run_in_executor(None, urllib.request.urlopen, url)
    response = yield from fut
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
# One task per zipcode, each carrying its index in the list
tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
print(zip_cities)
{0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}
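I don't have Python >= 3.4.4, so I had to use asyncio.async instead of asyncio.ensure_future. On Python 3.7 and later, async is a reserved keyword, so asyncio.async no longer parses and asyncio.ensure_future is the one to use.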
Or change the logic and create the dict from task.result from the tasks:
@asyncio.coroutine
def get_cities(zipcode):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
    fut = loop.run_in_executor(None, urllib.request.urlopen, url)
    response = yield from fut
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    return [zipcode, city, state]

loop = asyncio.get_event_loop()
tasks = [asyncio.async(get_cities(z)) for z in zips]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
zip_cities = {i: tsk.result() for i, tsk in enumerate(tasks)}
print(zip_cities)
{0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}
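A side note on collecting results: asyncio.wait returns unordered sets of done and pending tasks, which is why the dict above is built from the original tasks list. asyncio.gather returns results in the order its arguments were passed, regardless of completion order. The sketch below illustrates that, assuming Python 3.7+ (async/await syntax and asyncio.run); lookup and its delay argument are hypothetical stand-ins for a real request, not part of any API.

```python
import asyncio

async def lookup(zipcode, delay):
    # Hypothetical stand-in for a real request: wait, then echo the zipcode.
    await asyncio.sleep(delay)
    return zipcode

async def main():
    # The first call finishes last, yet gather() keeps the input order.
    results = await asyncio.gather(lookup('90210', 0.2), lookup('60647', 0.05))
    return dict(enumerate(results))

if __name__ == "__main__":
    print(asyncio.run(main()))  # {0: '90210', 1: '60647'}
```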
If you are looking at outside modules there is also a port of requests that works with asyncio.
Upvotes: 3
Reputation: 94981
You're not going to be able to get any concurrency if you use urllib to do the HTTP request, because it's a synchronous library. Wrapping the function that calls into urllib in a coroutine doesn't change that. You have to use an asynchronous HTTP client that's integrated into asyncio, like aiohttp:
import asyncio
import json
import aiohttp

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from aiohttp.request('get', url)
    string = (yield from response.read()).decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
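To make the point about blocking calls concrete, here is a small timing sketch, assuming Python 3.7+. It uses time.sleep as a stand-in for a synchronous call such as urllib.request.urlopen and asyncio.sleep as a stand-in for a truly asynchronous one; blocking_task, cooperative_task and timed are hypothetical names chosen for illustration.

```python
import asyncio
import time

async def blocking_task():
    # Blocks the whole event loop, as a synchronous HTTP call would.
    time.sleep(0.2)

async def cooperative_task():
    # Yields control to the event loop while waiting.
    await asyncio.sleep(0.2)

async def timed(factory, n=3):
    # Run n copies of the coroutine "concurrently" and measure wall time.
    start = time.perf_counter()
    await asyncio.gather(*(factory() for _ in range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print('blocking:    %.2fs' % asyncio.run(timed(blocking_task)))
    print('cooperative: %.2fs' % asyncio.run(timed(cooperative_task)))
```

The blocking variant should take roughly three times as long as the cooperative one, because each time.sleep call holds the event loop until it returns.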
I know you prefer to only use the stdlib, but the asyncio library doesn't include an HTTP client, so you'd basically have to re-implement pieces of aiohttp to recreate the functionality it's providing. I suppose another option would be to make the urllib calls in a background thread, so that they don't block the event loop, but it's kind of silly to do when aiohttp is available (and sort of defeats the purpose of using asyncio in the first place):
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from loop.run_in_executor(executor, urllib.request.urlopen, url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    executor = ThreadPoolExecutor(10)
    loop = asyncio.get_event_loop()
    tasks = [asyncio.async(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
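For readers on newer interpreters: asyncio.async stopped parsing in Python 3.7 and @asyncio.coroutine was removed in 3.11, so the code above no longer runs there as written. Below is a rough sketch of the same thread-pool approach using async/await, assuming Python 3.7+. The helper names fetch_json and parse_city_state and the fetch parameter are hypothetical, introduced only so the blocking call can be swapped out; the API key in the URL is still a placeholder.

```python
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Same endpoint as above; the key is a placeholder and must be replaced.
URL = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address={}&sensor=true'

def fetch_json(zipcode):
    """Blocking helper: download and decode the response for one zipcode."""
    with urllib.request.urlopen(URL.format(zipcode)) as response:
        return json.loads(response.read().decode('utf-8'))

def parse_city_state(zipcode, data):
    # Same index-based lookup as the code above.
    components = data['results'][0]['address_components']
    return [zipcode, components[1]['long_name'], components[3]['long_name']]

async def get_cities(zipcodes, fetch=fetch_json, max_workers=10):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers) as executor:
        # Each blocking fetch runs in a worker thread, off the event loop.
        futures = [loop.run_in_executor(executor, fetch, z) for z in zipcodes]
        responses = await asyncio.gather(*futures)
    # gather() preserves input order, so zipcodes and responses line up.
    return {i: parse_city_state(z, d)
            for i, (z, d) in enumerate(zip(zipcodes, responses))}
```

With a valid key, asyncio.run(get_cities(zips)) would be the entry point. Passing a different fetch callable makes it possible to try the function without network access.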
Upvotes: 24