How to append findall results to list?

Question

I am trying to parse a website for all links that have the attribute rel="nofollow". I want to print that list, one link at a time. However, I failed to append the results of findAll() to my list box (my attempt is commented out).

What did I do wrong?

import sys
import urllib2
from BeautifulSoup import BeautifulSoup


page = urllib2.urlopen(sys.argv[1]).read()
soup = BeautifulSoup(page)
soup.prettify()

box = []

for anchor in soup.findAll('a', href=True, attrs={'rel': 'nofollow'}):
    # box.extend(anchor['href'])
    print anchor['href']

# print box

Martijn Pieters · Accepted Answer

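You are looping over the results of soup.findAll(), so each anchor is a single element, not a list. list.extend() expects an iterable and adds each of its items; given a string such as anchor['href'], it adds one character at a time. Use .append() to add an individual element: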

box.append(anchor['href'])
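
To see the difference in isolation, here is a minimal sketch that needs neither BeautifulSoup nor a network connection. The URL is a made-up placeholder standing in for anchor['href'], not a value from the question:

```python
# Hypothetical href value standing in for anchor['href'].
href = "http://example.com/a"

extended = []
extended.extend(href)   # iterates over the string: one list item per character

appended = []
appended.append(href)   # adds the whole string as a single item

print(extended[:4])     # the first few single-character items
print(appended)         # a list holding the one complete URL
```

With extend(), the list ends up holding as many items as the URL has characters; with append(), it holds exactly one item per link.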

You could also use a list comprehension to grab all href attributes:

box = [a['href'] for a in soup.findAll('a', href=True, attrs = {'rel' : 'nofollow'})]
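
The comprehension should build the same list as the loop with .append(). A small sketch of that equivalence, using plain dicts as hypothetical stand-ins for the anchor tags that findAll() returns (the URLs are placeholders), and then printing one link per line as the question asks:

```python
# Hypothetical stand-ins for anchor tags; like real tags, they support a['href'].
anchors = [
    {"href": "http://example.com/a"},
    {"href": "http://example.com/b"},
]

# Loop form, using .append() for each individual element.
box_loop = []
for anchor in anchors:
    box_loop.append(anchor["href"])

# Comprehension form, equivalent to the loop above.
box_comp = [a["href"] for a in anchors]

# Print the collected links, one per line.
for link in box_comp:
    print(link)
```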
