Reputation: 6872
I am writing a Python script that calls the Instagram API and builds an array of all the photos. The API results are paginated, so each response contains only 25 results. If there are more photos, the response includes a next_url that points to the next batch.
I already have the script working in PHP, where my function does something like this:
// loop through this current function again with the next batch of photos
if($data->pagination->next_url) :
    $func = __FUNCTION__;
    $next_url = json_decode(file_get_contents($data->pagination->next_url, true));
    $func($next_url);
endif;
How can I do something like this in Python?
My function looks sort of like this:
def add_images(url):
    if url['pagination']['next_url']:
        try:
            next_file = urllib2.urlopen(url['pagination']['next_url'])
            next_json = f.read()
        finally:
            # THIS DOES NOT WORK
            next_url = json.loads(next_json)
            add_images(next_url)
    return
But obviously I can't just call add_images() from within. What are my options here?
Upvotes: 0
Views: 62
Reputation: 15398
You can call add_images() from within add_images(). Last time I checked, recursion still works in Python ;-).
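To illustrate, here is a minimal sketch of the recursive approach. It is not your actual script: FAKE_PAGES and fetch_page() are hypothetical stand-ins for the HTTP request plus json.loads(), and the "data" key holding the photos is an assumption about the response layout.

```python
# Hypothetical in-memory "API": each URL maps to an already-decoded page.
FAKE_PAGES = {
    "page1": {"data": ["a.jpg", "b.jpg"], "pagination": {"next_url": "page2"}},
    "page2": {"data": ["c.jpg"], "pagination": {"next_url": "page3"}},
    "page3": {"data": ["d.jpg"], "pagination": {}},  # last page: no next_url
}


def fetch_page(url):
    """Stand-in for fetching the URL and decoding the JSON body."""
    return FAKE_PAGES[url]


def add_images(page, images):
    """Collect this page's photos, then recurse into the next page, if any."""
    images.extend(page["data"])
    # .get() avoids a KeyError when the last page has no next_url.
    next_url = page.get("pagination", {}).get("next_url")
    if next_url:
        add_images(fetch_page(next_url), images)  # plain recursive call
    return images


all_images = add_images(fetch_page("page1"), [])
print(all_images)  # ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg']
```

Passing the result list down as an argument keeps all the photos in one place without needing a global.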
However, since Python does not perform tail call elimination, every recursive call adds a stack frame, so you need to be wary of running into the recursion limit. The default limit in CPython is 1,000 (available via sys.getrecursionlimit()), and at 25 photos per page that is on the order of 25,000 photos, so you probably don't need to worry.
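If you want to see the limit in action, the following sketch may help. It is written for Python 3, where the exception is RecursionError (Python 2 raises RuntimeError instead); depth() and hits_limit() are hypothetical helpers that exist only for this demonstration.

```python
import sys

# Typically 1000 on CPython, but it can be changed with sys.setrecursionlimit().
print(sys.getrecursionlimit())


def depth(n):
    """Recurse n levels deep and return n; it only exists to use stack frames."""
    return 0 if n == 0 else 1 + depth(n - 1)


def hits_limit(n):
    """Return True if recursing n levels deep exceeds the interpreter's limit."""
    try:
        depth(n)
    except RecursionError:  # Python 3.5+; a subclass of RuntimeError
        return True
    return False


shallow = hits_limit(50)                          # well below the limit
deep = hits_limit(sys.getrecursionlimit() + 100)  # guaranteed to exceed it
print(shallow, deep)
```

So running out of depth gives you an ordinary exception you can catch, not a crashed interpreter.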
That said, nowadays, with generators and the advent of async, I'd consider such JavaScript-style recursive callbacks unpythonic. You might instead consider using generators and/or coroutines:
import contextlib
import json
import urllib2

def get_images(base_url):
    url = base_url
    while url:
        with contextlib.closing(urllib2.urlopen(url)) as url_file:
            # Decode the body; without json.loads() this would be a plain string.
            json_data = json.loads(url_file.read())
        # get_image_urls() is a helper you write yourself: it extracts
        # the images from the decoded JSON and returns an iterable.
        # python 3.3 and up have "yield from"
        # (see https://www.python.org/dev/peps/pep-0380/)
        for img_url in get_image_urls(json_data):
            yield img_url
        # dict.get() conveniently returns None or
        # the provided default argument when the
        # element is missing.
        url = json_data.get('pagination', {}).get('next_url')

images = list(get_images(base_url))
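For Python 3, the same idea might look like the sketch below, using "yield from" and urllib.request in place of urllib2. Treat it as an illustration under assumptions: iter_images(), fetch_json(), counting_fetch() and FAKE_PAGES are names I made up, and the "data" key is an assumed response layout. The fetch function is passed in so the pagination logic can be tried without touching the network.

```python
import itertools
import json
import urllib.request


def fetch_json(url):
    """Real fetch for Python 3: download the URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_images(first_url, fetch_page=fetch_json):
    """Yield every image across all pages, following next_url links."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("data", [])  # PEP 380, Python 3.3+
        url = page.get("pagination", {}).get("next_url")


# Hypothetical in-memory pages, used here instead of real HTTP requests.
FAKE_PAGES = {
    "page1": {"data": ["a.jpg", "b.jpg"], "pagination": {"next_url": "page2"}},
    "page2": {"data": ["c.jpg"], "pagination": {}},
}

fetched = []  # records which pages were actually requested


def counting_fetch(url):
    fetched.append(url)
    return FAKE_PAGES[url]


# Generators are lazy: asking for two images only requests the first page.
first_two = list(itertools.islice(iter_images("page1", counting_fetch), 2))
pages_for_first_two = list(fetched)

images = list(iter_images("page1", counting_fetch))
print(first_two, pages_for_first_two, images)
```

Because the loop replaces the recursion, the recursion limit no longer matters, however many pages there are.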
Upvotes: 4