Juanjo Conti

Reputation: 30013

How can I use BeautifulSoup to find all the links in a page pointing to a specific domain?

Upvotes: 5

Views: 2807

Answers (1)

viksit

Reputation: 7742

Use SoupStrainer, which restricts parsing to the tags you ask for:

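from BeautifulSoup import BeautifulSoup, SoupStrainer
import re

# doc is the HTML source of the page, as a string

# Find all links
links = SoupStrainer('a')
all_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=links)]

# Find only the links whose href contains "example.com/"
# (the dot is escaped so it matches a literal dot, not any character)
linkstodomain = SoupStrainer('a', href=re.compile(r'example\.com/'))
domain_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=linkstodomain)]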

Edit: Modified the example from the official documentation.
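
One caveat with a substring regex is that it also matches hosts like notexample.com, or URLs that only mention the domain in a query string. A stricter check can compare the parsed hostname instead. This is only a rough sketch: points_to_domain is a hypothetical helper (not part of BeautifulSoup), it uses Python 3's urllib.parse, and it assumes you want subdomains such as www.example.com to count as well.

```python
from urllib.parse import urlparse


def points_to_domain(href, domain="example.com"):
    """Return True if href is an absolute URL on `domain` or one of its subdomains."""
    if not href:
        # Tags without an href attribute yield None
        return False
    host = urlparse(href).hostname
    if host is None:
        # Relative links have no hostname
        return False
    host = host.lower()
    return host == domain or host.endswith("." + domain)


# Illustrative hrefs, made up for this example
hrefs = [
    "http://example.com/page",
    "https://www.example.com/",
    "http://notexample.com/",
    "http://other.org/?next=example.com/",
    "/relative/path",
]
matching = [h for h in hrefs if points_to_domain(h)]
```

Since BeautifulSoup attribute filters can be callables that receive the attribute value, it should be possible to pass the helper in place of the regex, e.g. SoupStrainer('a', href=points_to_domain), though I have not tested that against every BeautifulSoup version.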

Upvotes: 8
