EspenG

Reputation: 537

Find JavaScript links with Python

Is there a way to find JavaScript links on a webpage with Python? I use mechanize, but it doesn't find all the links I want. I want the URLs of the pictures on this site: http://500px.com/popular

Upvotes: 0

Views: 2384

Answers (1)

Martijn Pieters

Reputation: 1125208

With just BeautifulSoup this is quite easy:

js_links = soup.select('a[href^="javascript:"]')

This selects all <a> elements with an href attribute whose value starts with "javascript:":

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <html><body>
... <a href="http://stackoverflow.com">Not a javascript link</a>
... <a name="target">Not a link, no href</a>
... <a href="javascript:alert('P4wned');">Javascript link (with scary message)</a>
... <a href="javascript:return False">Another javascript link</a>
... </body></html>
... ''', 'html.parser')
>>> for link in soup.select('a[href^="javascript:"]'):
...     print(link['href'], link.get_text())
... 
javascript:alert('P4wned'); Javascript link (with scary message)
javascript:return False Another javascript link
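
If BeautifulSoup isn't installed, roughly the same filter can be sketched with only the standard library. This is a swapped-in technique (html.parser instead of a CSS selector), not what the answer above uses, and the names JavascriptLinkParser and find_javascript_links are made up for illustration:

```python
from html.parser import HTMLParser


class JavascriptLinkParser(HTMLParser):
    """Collect (href, text) pairs for <a> tags whose href starts with 'javascript:'."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the javascript link currently open, if any
        self._text = []     # text chunks seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Same prefix test as the a[href^="javascript:"] selector
            if href is not None and href.startswith("javascript:"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching link
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def find_javascript_links(html):
    """Hypothetical helper: return a list of (href, text) tuples from an HTML string."""
    parser = JavascriptLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling find_javascript_links(html) on a page's HTML string returns the javascript: links in document order as (href, text) tuples. It is less forgiving of broken markup than BeautifulSoup, so treat it as a fallback.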

Upvotes: 1
