anthony-dandrea

Reputation: 2833

Making multiple calls with asyncio and adding result to a dictionary

I am having trouble wrapping my head around Python 3's asyncio library. I have a list of zip codes and I am trying to make async calls to an API to get each zip code's corresponding city and state. I can do it successfully in sequence with a for loop, but I want to make it faster for a large list of zip codes.

This is an example of my original code, which works:

import urllib.request, json

zips = ['90210', '60647']

def get_cities(zipcodes):
    zip_cities = dict()
    for idx, zipcode in enumerate(zipcodes):
        url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
        response = urllib.request.urlopen(url)
        string = response.read().decode('utf-8')
        data = json.loads(string)
        city = data['results'][0]['address_components'][1]['long_name']
        state = data['results'][0]['address_components'][3]['long_name']
        zip_cities.update({idx: [zipcode, city, state]})
    return zip_cities

results = get_cities(zips)
print(results)
# returns {0: ['90210', 'Beverly Hills', 'California'],
#          1: ['60647', 'Chicago', 'Illinois']}

This is my terrible, non-functional attempt at making it async:

import asyncio
import urllib.request, json

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcodes):
    url = 'http://maps.googleapis.com/maps/api/geocode/json?address='+zipcode+'&sensor=true'
    response = urllib.request.urlopen(url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
loop.run_until_complete([get_cities(zip) for zip in zips])
loop.close()
print(zip_cities) # doesnt work

Any help is much appreciated. All of the tutorials I've come across online seem to be a tad over my head.

Note: I've seen some examples use aiohttp. I was hoping to stick with the native Python 3 libraries if possible.

Upvotes: 16

Views: 13536

Answers (2)

Padraic Cunningham

Reputation: 180550

I haven't done much with asyncio, but asyncio.get_event_loop() should be what you need. You also have to change what your function takes as arguments and use asyncio.wait(tasks), as per the docs:

import asyncio
import json
import urllib.request

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode, idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
    # urlopen blocks, so run it in the default thread pool executor
    fut = loop.run_in_executor(None, urllib.request.urlopen, url)
    response = yield from fut
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

loop = asyncio.get_event_loop()
# asyncio.ensure_future is the newer name for asyncio.async
tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
print(zip_cities)
# {0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}

I don't have Python >= 3.4.4, so I had to run this with asyncio.async, the older name for asyncio.ensure_future. On Python 3.7 and later, async is a reserved keyword and only asyncio.ensure_future works.

Or change the logic and build the dict from each task's result():

@asyncio.coroutine
def get_cities(zipcode):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdefg&address='+zipcode+'&sensor=true'
    # urlopen blocks, so run it in the default thread pool executor
    fut = loop.run_in_executor(None, urllib.request.urlopen, url)
    response = yield from fut
    string = response.read().decode('utf-8')
    data = json.loads(string)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    return [zipcode, city, state]

loop = asyncio.get_event_loop()
# asyncio.ensure_future replaces the older asyncio.async (Python >= 3.4.4)
tasks = [asyncio.ensure_future(get_cities(z)) for z in zips]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
# tasks keeps the order of zips, so the indices match the input list
zip_cities = {i: tsk.result() for i, tsk in enumerate(tasks)}
print(zip_cities)
# {0: ['90210', 'Beverly Hills', 'California'], 1: ['60647', 'Chicago', 'Illinois']}
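
For readers on current Python versions, the "return a value and build the dict from the results" idea can also be sketched with async/await and asyncio.gather, which returns results in argument order rather than completion order. This is only an illustration, not the code from the answer: FAKE_RESPONSES and the delay parameter are hypothetical stand-ins for the real API call so that the example runs offline.

```python
import asyncio

# Hypothetical canned data standing in for the geocoding API's responses.
FAKE_RESPONSES = {
    '90210': ('Beverly Hills', 'California'),
    '60647': ('Chicago', 'Illinois'),
}

async def get_city(zipcode, delay):
    # asyncio.sleep stands in for awaiting a real (non-blocking) request.
    await asyncio.sleep(delay)
    city, state = FAKE_RESPONSES[zipcode]
    return [zipcode, city, state]

async def get_cities(zipcodes):
    # Give the first lookup the longest delay so that it finishes last.
    delays = [0.05 * (len(zipcodes) - i) for i in range(len(zipcodes))]
    # gather() returns results in argument order, not completion order.
    results = await asyncio.gather(
        *(get_city(z, d) for z, d in zip(zipcodes, delays))
    )
    return dict(enumerate(results))

zip_cities = asyncio.run(get_cities(['90210', '60647']))
print(zip_cities)
```

Because the results come back in input order, the dict keys line up with the original list even though the second lookup completes first.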

If you are looking at outside modules there is also a port of requests that works with asyncio.

Upvotes: 3

dano

Reputation: 94981

You're not going to be able to get any concurrency if you use urllib to do the HTTP request, because it's a synchronous library. Wrapping the function that calls into urllib in a coroutine doesn't change that. You have to use an asynchronous HTTP client that's integrated into asyncio, like aiohttp:

import asyncio
import json
import aiohttp

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode,idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from aiohttp.request('get', url)
    string = (yield from response.read()).decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    # asyncio.ensure_future replaces the older asyncio.async (Python >= 3.4.4)
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
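
The point about synchronous calls can be seen without any network access. The following is a rough sketch using only the stdlib: time.sleep plays the role of a blocking call such as urlopen, and asyncio.sleep plays the role of an awaitable request from an async HTTP client. The function names are made up for the illustration.

```python
import asyncio
import time

async def blocking_fetch(delay):
    # time.sleep stands in for a synchronous call such as urlopen:
    # it never yields to the event loop, so no other task can run.
    time.sleep(delay)

async def cooperative_fetch(delay):
    # asyncio.sleep stands in for an awaitable I/O call: the task is
    # suspended here and the loop runs the other tasks in the meantime.
    await asyncio.sleep(delay)

async def timed(worker, count=3, delay=0.2):
    # Run `count` workers "concurrently" and measure the wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start

blocking_elapsed = asyncio.run(timed(blocking_fetch))
cooperative_elapsed = asyncio.run(timed(cooperative_fetch))
print('blocking:    %.2fs' % blocking_elapsed)
print('cooperative: %.2fs' % cooperative_elapsed)
```

The blocking version takes roughly the sum of the delays, while the cooperative version takes roughly one delay, which is the difference between wrapping urllib in a coroutine and using a client that actually awaits its I/O.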

I know you prefer to only use the stdlib, but the asyncio library doesn't include an HTTP client, so you'd basically have to re-implement pieces of aiohttp to recreate the functionality it's providing. I suppose another option would be to make the urllib calls in a background thread, so that they don't block the event loop, but it's kind of silly to do when aiohttp is available (and sort of defeats the purpose of using asyncio in the first place):

import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

zips = ['90210', '60647']
zip_cities = dict()

@asyncio.coroutine
def get_cities(zipcode,idx):
    url = 'https://maps.googleapis.com/maps/api/geocode/json?key=abcdfg&address='+zipcode+'&sensor=true'
    response = yield from loop.run_in_executor(executor, urllib.request.urlopen, url)
    string = response.read().decode('utf-8')
    data = json.loads(string)
    print(data)
    city = data['results'][0]['address_components'][1]['long_name']
    state = data['results'][0]['address_components'][3]['long_name']
    zip_cities.update({idx: [zipcode, city, state]})

if __name__ == "__main__":
    executor = ThreadPoolExecutor(10)
    loop = asyncio.get_event_loop()
    # asyncio.ensure_future replaces the older asyncio.async (Python >= 3.4.4)
    tasks = [asyncio.ensure_future(get_cities(z, i)) for i, z in enumerate(zips)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print(zip_cities)
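
On Python 3.7 and later, the same background-thread approach might look roughly like the sketch below, written with async/await and asyncio.run. It is not the code from the answer: URL_TEMPLATE is a placeholder rather than the real endpoint, and the fetch parameter and fake_fetch helper are hypothetical additions so that the example can be run offline.

```python
import asyncio
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholder endpoint; substitute the real API URL and key.
URL_TEMPLATE = 'https://example.invalid/geocode/json?address={}'

def fetch_json(zipcode):
    """Blocking HTTP GET; meant to run in a worker thread, not on the loop."""
    with urllib.request.urlopen(URL_TEMPLATE.format(zipcode)) as response:
        return json.loads(response.read().decode('utf-8'))

def parse(zipcode, data):
    # Same positional lookups as the original code.
    components = data['results'][0]['address_components']
    return [zipcode, components[1]['long_name'], components[3]['long_name']]

async def get_cities(zipcodes, fetch=fetch_json, max_workers=10):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers) as executor:
        # Each blocking fetch runs in its own worker thread.
        futures = [loop.run_in_executor(executor, fetch, z) for z in zipcodes]
        payloads = await asyncio.gather(*futures)
    return {i: parse(z, d) for i, (z, d) in enumerate(zip(zipcodes, payloads))}

def fake_fetch(zipcode):
    """Offline stand-in for fetch_json, shaped like the response parse() expects."""
    names = {'90210': ('Beverly Hills', 'California'),
             '60647': ('Chicago', 'Illinois')}
    city, state = names[zipcode]
    parts = [zipcode, city, 'County', state]
    return {'results': [{'address_components': [{'long_name': p} for p in parts]}]}

zip_cities = asyncio.run(get_cities(['90210', '60647'], fetch=fake_fetch))
print(zip_cities)
```

Passing the fetch function in as a parameter keeps the blocking HTTP code separate from the scheduling code, so the real fetch_json can be swapped in once the URL and key are filled in.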

Upvotes: 24
