Reputation: 26766
I've got some code which downloads a list of data from numerous URLs, then calls another function, passing in each result. Something like...
def ShowUrls(self, url):
    Urls = self.Scraper.GetSubUrls(url)
    for Url in Urls:
        self.UI.addLink(Url[0], Url[1])
This works fine, but there's a long delay while self.Scraper.GetSubUrls runs, and then all the UI calls are made very rapidly. This causes the UI to show "0 Urls added" for a long time, then jump straight to completion.
What I'd like is to pass the self.UI.addLink method into the self.Scraper.GetSubUrls method so that it can be called as soon as each URL is retrieved. That way the UI would show the correct count as each URL arrives.
If I were in Javascript, I'd do something like....
getSubUrls(url, function(x, y) {UI.addLink(x, y)})
and then, inside getSubUrls do
SomeParamMethod(Param1, Param2)
Is this possible? If so, what's the correct syntax?
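For reference, a direct Python translation of the JavaScript pattern above might look something like this. Functions and bound methods are first-class objects in Python, so they can be passed as arguments directly; the function name and the hard-coded data below are placeholders, not the real scraper:

```python
def get_sub_urls(url, callback):
    # Hypothetical stand-in for the real scraper: the results here are
    # hard-coded, but the point is that the callback fires once per
    # result, as soon as it is available.
    results = [("Example A", url + "/a"),
               ("Example B", url + "/b")]
    for title, link in results:
        callback(title, link)  # e.g. self.UI.addLink

added = []
get_sub_urls("http://example.com", lambda t, l: added.append((t, l)))
```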
Upvotes: 5
Views: 4837
Reputation: 251548
This suggestion is a bit more involved, but if you control GetSubUrls, a more Pythonic approach might be to make it a generator that yields each URL as it is retrieved. You can then process each URL outside the function in a for loop. I'm assuming GetSubUrls looks something vaguely like this:
def GetSubUrls(self, url):
    urls = []
    document = openUrl(url)
    for stuff in document:
        urls.append(stuff)
    return urls
That is, it builds a list of URLs and returns the whole list. You can make it a generator:
def GetSubUrls(self, url):
    document = openUrl(url)
    for stuff in document:
        yield stuff
Then you can just do
for url in self.Scraper.GetSubUrls(url):
    self.UI.addLink(url[0], url[1])
This is the same as before, but if GetSubUrls is a generator, it doesn't wait to collect all the sub-URLs and then return them. It yields them one at a time, and your code can likewise process them one at a time.
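The laziness is easy to verify with a toy generator (unrelated to the scraper):

```python
def numbers():
    print("starting")  # runs only once iteration actually begins
    for i in range(3):
        yield i

g = numbers()          # nothing printed yet; the body hasn't started
first = next(g)        # prints "starting", then yields 0
remaining = list(g)    # exhausts the rest: [1, 2]
```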
One advantage of this over passing a callback is that you can store the generator and use it whenever you want, instead of having the calls made inside GetSubUrls. That is, you can do urls = GetSubUrls(url), save the result for later, and still iterate over the URLs "on demand" at a later time, when they will be retrieved one by one. A callback approach forces GetSubUrls to process all the URLs right away. Another advantage is that you needn't create a bunch of tiny callbacks with little content; instead you can write those one-liners naturally as the body of the for loop.
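To sketch the "on demand" point, here is a standalone generator version (the data is a placeholder standing in for incremental network fetches):

```python
def get_sub_urls(url):
    # Placeholder results; a real version would fetch each one lazily.
    for pair in [("A", url + "/1"), ("B", url + "/2")]:
        yield pair

urls = get_sub_urls("http://example.com")  # no fetching has happened yet
first = next(urls)   # retrieves only the first sub-URL, on demand
rest = list(urls)    # pulls the remainder whenever you are ready
```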
Read up on Python generators for more info on this (for instance, What does the "yield" keyword do in Python?).
Upvotes: 6