nutship

Reputation: 4924

Appending numbers to a list

urllist = ['http://example.com',
           'http://example1.com']
i = 0
while i < len(urllist):
    source = urllib.urlopen(urllist[i]).read()
    regex = '(\d{3})/">(\w+\s-\s\w+)</a>'  # e.g. '435', 'Tom-Jerry' 
    p = re.compile(regex)
    db = re.findall(p, source)
    db = [tuple(filter(None, t)) for t in db]   

    hero_id = []
    for j in db:
        hero_id.append(j[0])

    i += 1
print hero_id

Note that after db = [tuple(filter(None, t)) for t in db], db is a list of tuples like this: [('564', 'Tom', 'Jerry'), ('321', 'X-man', 'Hulk')]. Up to the hero_id = [] line everything works like a charm. The for loop needs to append every number (from every URL in urllist). It only partly does its job: at the end, the hero_id list contains only the numbers from the last URL (the previous numbers are gone). Ideas?

Upvotes: 0

Views: 97

Answers (2)

Ionut Hulub

Reputation: 6326

That's because you reset hero_id to an empty list on every iteration of the while loop (hero_id = []), so each pass throws away the numbers collected so far.

Move that line to just after i = 0, before the loop starts.

Or you can simplify the code like so:

import re
import urllib

urllist = ['http://example.com', 'http://example1.com']
hero_id = []  # created once, so IDs from every URL accumulate
for url in urllist:
    db = re.findall(r'(\d{3})/">(\w+\s-\s\w+)</a>', urllib.urlopen(url).read(), re.DOTALL)
    for j in db:
        hero_id.append(tuple(filter(None, j))[0])
print hero_id
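
To try the same idea without hitting the network, here is a minimal sketch that runs the regex over in-memory strings. The pages list and the collect_ids helper are made up for illustration and stand in for urllib.urlopen(url).read(); the sample markup is only an assumption about what the real pages look like.

```python
import re

# Hypothetical page sources standing in for urllib.urlopen(url).read()
pages = [
    '<a href="/hero/564/">Tom - Jerry</a>',
    '<a href="/hero/321/">Xman - Hulk</a><a href="/hero/435/">Bat - Cat</a>',
]

pattern = re.compile(r'(\d{3})/">(\w+\s-\s\w+)</a>')


def collect_ids(sources):
    """Return the three-digit IDs found in every source, in order."""
    hero_id = []  # created once, before the loop
    for source in sources:
        # findall yields (id, name) tuples; keep only the id
        hero_id.extend(match[0] for match in pattern.findall(source))
    return hero_id


ids = collect_ids(pages)
print(ids)
```

Because hero_id is created before the loop, the IDs from the first page are still there when the second page is processed.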

Upvotes: 4

Anurag Ramdasan

Reputation: 4330

Since hero_id is assigned inside the while loop, it is overwritten with a new empty list on every iteration. Initialize hero_id once, before the loop, and do not reset it inside.

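hero_id = []  # initialize once, outside the loop
i = 0
while i < len(urllist):
    # your code: fetch urllist[i], parse it, append to hero_id
    i += 1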
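
The difference can be seen in isolation with a small sketch. The batches data and both function names are hypothetical; each inner list stands in for the numbers parsed from one URL.

```python
# Hypothetical per-URL results; each inner list stands in for one page
batches = [['564', '565'], ['321']]


def reset_inside(batches):
    """Mimics the question: the list is re-created on every pass."""
    hero_id = []
    for batch in batches:
        hero_id = []  # rebinds the name to a fresh list, dropping earlier items
        for number in batch:
            hero_id.append(number)
    return hero_id


def reset_outside(batches):
    """The fix: the list is created once and keeps growing."""
    hero_id = []
    for batch in batches:
        for number in batch:
            hero_id.append(number)
    return hero_id


print(reset_inside(batches))   # only the last batch survives
print(reset_outside(batches))  # every batch is kept
```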

Upvotes: 1
