Reputation: 3
(Using Python 3.3.2) Hi, I'm trying to write a crawling function for a text cloud. It should go through a list of links and ideally return a list containing the function's output for each element in that list. However, I'm stuck using print(b) instead of actually returning what I want. In my for loop, how would I return everything I currently get from my print(b) statement? It can all be in one list or compiled in some other way. Thank you :) tl;dr: how do I return everything I get from a for loop?
def crawl():
    linkList = inputFunction()[1:][0] #makes a list of a bunch of URL's
    for i in range(len(linkList)):
        print(i)
        t = getHTML(linkList[i]) #getHTML returns tuple of text in the input URL
        alreadyCrawl = alreadyCrawl + list(linkList[i]) #ignore this
        t = list(t)
        b = counting(t) #makes dictionary of word counts
        print(b)
    return
Upvotes: 0
Views: 713
Reputation: 32449
Either you put them in a list and return the list at the end, or you "yield" them (hence creating a generator).
First way:
def f():
    acc = []
    for x in range(10):
        acc.append(someFunctionOfX(x))
    return acc
Second way:
def g():
    for x in range(10):
        yield someFunctionOfX(x)
Maybe the most important difference is the following: if any call to someFunctionOfX raises an exception in example 1, the function won't return anything. In example 2, if, say, the 5th value cannot be yielded for some reason, the previous four have already been yielded and probably used in the caller's context.
Here you can see the difference:
def f():
    acc = []
    for x in range(-3, 4):
        acc.append(2 / x)
    return acc

def g():
    for x in range(-3, 4):
        yield 2 / x

def testF():
    for x in f(): print(x)

def testG():
    for x in g(): print(x)
Calling testF simply fails (ZeroDivisionError: division by zero) and doesn't print anything. Calling testG prints
-0.6666666666666666
-1.0
-2.0
and then fails (ZeroDivisionError: division by zero).
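If the caller wants to keep whatever the generator produced before the failure, it can catch the exception itself. This is only a minimal sketch: g is repeated so the snippet runs on its own, and collect_until_error is a hypothetical helper name that is not part of the code above.

```python
def g():
    for x in range(-3, 4):
        yield 2 / x  # raises ZeroDivisionError when x reaches 0


def collect_until_error(gen):
    """Hypothetical helper: gather values until the generator fails."""
    results = []
    try:
        for value in gen:
            results.append(value)
    except ZeroDivisionError:
        # Everything yielded before the error is already in results.
        pass
    return results


partial = collect_until_error(g())
print(partial)
```

With the list-returning f this is not possible, because the exception escapes before acc is ever returned.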
My (very personal) criterion for either returning a list or yielding values is the following: If I need the data stored somewhere, I return a list. If I just need to process each member, I yield them.
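Applied to the crawl function from the question, the same choice might look roughly like this. It is a sketch under assumptions: get_html and counting below are hypothetical stand-ins for the asker's getHTML and counting (whose real implementations are not shown), and the links are passed in as an argument instead of coming from inputFunction.

```python
from collections import Counter


def get_html(url):
    # Hypothetical stand-in: returns a tuple of words for a fake "page".
    pages = {"a": ("spam", "eggs", "spam"), "b": ("eggs",)}
    return pages[url]


def counting(words):
    # Hypothetical stand-in: dictionary of word counts.
    return dict(Counter(words))


def crawl(links):
    # Yield one word-count dictionary per link instead of printing it.
    for link in links:
        yield counting(list(get_html(link)))


# If the data needs to be stored, materialise the generator as a list.
all_counts = list(crawl(["a", "b"]))

# If each result only needs to be processed once, iterate directly.
for counts in crawl(["a", "b"]):
    print(counts)
```

A generator can always be turned into a list by the caller, so yielding leaves both options open.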
Upvotes: 8
Reputation: 239
def crawl():
    linkList = inputFunction()[1:][0] #makes a list of a bunch of URL's
    return_list = []
    for i in range(len(linkList)):
        print(i)
        t = getHTML(linkList[i]) #getHTML returns tuple of text in the input URL
        alreadyCrawl = alreadyCrawl + list(linkList[i]) #ignore this
        t = list(t)
        b = counting(t) #makes dictionary of word counts
        return_list.append(b)
    return return_list
Upvotes: 0
Reputation: 465
You can return a list of the values you want.
def crawl():
    list_ret = [] #create empty list to store values
    for i in range(len(linkList)):
        # do some stuff
        b = counting(t) #makes dictionary of word counts
        list_ret.append(b) #append value to list
        print(b)
    return list_ret #return list of values
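When the loop body does nothing but build one value per item, the append pattern can also be written as a list comprehension. A minimal sketch, assuming a simple counting implementation (the asker's real one is not shown) and assuming the pages are passed in as tuples of words:

```python
def counting(words):
    # Assumed implementation: dictionary of word counts.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def crawl(pages):
    # One word-count dictionary per page, collected into a list.
    return [counting(list(page)) for page in pages]


result = crawl([("a", "b", "a"), ("c",)])
print(result)
```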
Upvotes: 0