mcbetz

Reputation: 2389

How to append findall results to list?

I am trying to parse a website for all links that have the attribute rel="nofollow". I want to print that list, one link per line. However, I failed to append the results of findAll() to my list box (my attempt is commented out below).

What did I do wrong?

import sys
import urllib2
from BeautifulSoup import BeautifulSoup


page = urllib2.urlopen(sys.argv[1]).read()
soup = BeautifulSoup(page)
soup.prettify()

box = []

for anchor in soup.findAll('a', href=True, attrs = {'rel' : 'nofollow'}):
#    box.extend(anchor['href'])
    print anchor['href']

# print box

Upvotes: 0

Views: 1518

Answers (1)

Martijn Pieters

Reputation: 1122182

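You are looping over the results of soup.findAll(), so each anchor is a single element, not a list. list.extend() iterates over its argument, and iterating over the href string adds every character to box as a separate item. Use .append() to add individual elements: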

box.append(anchor['href'])
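To see the difference in isolation, here is a minimal sketch that needs no network access. The href value is a made-up placeholder, not taken from the question:

```python
# Hypothetical URL, used only for illustration.
href = "http://example.com/a"

extended = []
extended.extend(href)   # iterates over the string: one character per element

appended = []
appended.append(href)   # adds the whole string as a single element

print(extended[:4])     # ['h', 't', 't', 'p']
print(appended)         # ['http://example.com/a']
```

So extend() is the right call only when the thing you are adding is itself a list (or other iterable) of items you want merged in.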

You could also use a list comprehension to grab all href attributes:

box = [a['href'] for a in soup.findAll('a', href=True, attrs = {'rel' : 'nofollow'})]
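Once box holds the hrefs, printing them one per line could look roughly like this. The list contents below are placeholder values standing in for the scraped results:

```python
# Hypothetical hrefs standing in for what the list comprehension would collect.
box = ["http://example.com/a", "http://example.com/b"]

# Join with newlines so each link ends up on its own line.
output = "\n".join(box)
print(output)
```

A plain for loop that prints each item works just as well; the join is simply a compact way to build the whole output at once.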

Upvotes: 1
