Reputation: 295
I have a loop. Every time the loop runs, a new list is created. I want to add all these lists together. My code is as follows:
while i < len(symbolslist):
    htmltext = urllib.urlopen('my-url.com/'+symbolslist[i]).read()
    pattern = re.compile('<a target="_blank" href="(.+?)" rel="nofollow"')
    applink = re.findall(pattern, htmltext)
    applink += applink
    i+=1
where applink is a list. However, with the current code I have, it only adds the last two lists together. What am I doing wrong?
Thanks!
Upvotes: 0
Views: 81
Reputation: 90879
The issue is that you are using applink as the variable name to store the list returned by re.findall(), so you rebind applink to a new list on every iteration and lose whatever it held before. Instead, store the result of re.findall() under a different name and then extend applink with that new list (or use +=).
Code -
applink = []
while i<len(symbolslist):
    url = "http://www.indeed.com/resumes/-/in-Singapore?co=SG&start="+str(symbolslist[i])
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    regex = '<a target="_blank" href="(.+?)" rel="nofollow"'
    pattern = re.compile(regex)
    tempapplink = re.findall(pattern,htmltext)
    print tempapplink
    applink += tempapplink
    i+=1
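
To see the difference without hitting the network, here is a minimal offline sketch in Python 3. The pages list and both function names are made up for illustration; only the regex comes from the question. It runs the rebinding version and the accumulating version side by side on the same input.

```python
import re

# Hypothetical stand-ins for the downloaded pages, so the sketch runs offline.
pages = [
    '<a target="_blank" href="/r/one" rel="nofollow">A</a>'
    '<a target="_blank" href="/r/two" rel="nofollow">B</a>',
    '<p>no links here</p>',
    '<a target="_blank" href="/r/three" rel="nofollow">C</a>',
]

# The pattern from the question, compiled once outside the loop.
pattern = re.compile('<a target="_blank" href="(.+?)" rel="nofollow"')


def collect_links_buggy(html_pages):
    """Same shape as the question's loop: one name for both roles."""
    applink = []
    for html in html_pages:
        applink = pattern.findall(html)  # rebinding discards earlier results
        applink += applink               # appends the current list to itself
    return applink


def collect_links(html_pages):
    """Accumulator and per-page result kept under separate names."""
    applink = []
    for html in html_pages:
        tempapplink = pattern.findall(html)  # fresh list for this page only
        applink += tempapplink               # extend the running total in place
    return applink


buggy = collect_links_buggy(pages)
fixed = collect_links(pages)
print(buggy)
print(fixed)
```

In this sketch the buggy version ends up with only the last page's matches, repeated twice, while the fixed version keeps the matches from every page in order.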
Upvotes: 2