Reputation: 147
import urllib2
from sys import argv
from bs4 import BeautifulSoup
arg = urllib2.urlopen(argv[1]).read()
soup = BeautifulSoup(arg)
a_tags = soup.find_all('a')  # stores a list of all the <a href="..."> tags
I need ONLY the tags that do NOT link to the same page (no # symbol in the href).
How can I filter them? Any help would be appreciated.
Upvotes: 0
Views: 260
Reputation: 298364
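You can match the href attribute with a function and remove the same-page links from the tree:
# value is None for <a> tags that have no href, so check it first
for a in soup.find_all('a', href=lambda value: value and value.startswith('#')):
    a.extract()  # detach the same-page link from the soup
After the loop, soup.find_all('a') returns only the remaining links.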
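If you would rather not modify the soup, the same function-matching idea can be inverted to select only the links you want. This is a minimal sketch, not the only way to do it: the sample HTML and the helper name is_not_same_page are made up for illustration, and it assumes bs4 with the built-in "html.parser".

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page
html = """
<a href="#top">Top</a>
<a href="page.html">Page</a>
<a href="http://example.com/#frag">External</a>
<a name="anchor">No href</a>
"""

soup = BeautifulSoup(html, "html.parser")


def is_not_same_page(href):
    # href is None for <a> tags without an href attribute; skip those too
    return href is not None and not href.startswith("#")


# Collect the href values of links that do not point at the same page
links = [a["href"] for a in soup.find_all("a", href=is_not_same_page)]
print(links)
```

Note that this only excludes hrefs that start with #. A link such as http://example.com/#frag is kept, since it points at another page; if you want to drop every href containing #, test for "#" not in href instead.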
Upvotes: 2