Yunti

Reputation: 7458

Beautiful Soup - capture all links with a certain class and text

I'm trying to capture all the relevant links from a web page with Beautiful Soup. All the links I need have both class="btn btn-gray" and the link text "More Info", i.e. <a ...>More Info</a>.

What's the best way to extract just these links?

Upvotes: 4

Views: 2048

Answers (2)

supermitch

Reputation: 2962

How about this?

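from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')  # html: the page source as a string

# The selector 'a.btn.btn-gray' requires both classes.
# Passing a list of classes to find_all would match tags that have either one.
all_links = []
for link in soup.select('a.btn.btn-gray'):
    if 'More Info' in link.text:
        all_links.append(link['href'])  # Save href only, for example.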

Or as a list comprehension:

links = soup.select('a.btn.btn-gray')  # both classes required
results = [link['href'] for link in links if 'More Info' in link.text]
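
To see why the way the classes are matched matters, here is a small self-contained sketch. The sample markup and hrefs are invented for illustration, and it uses the built-in 'html.parser' so it runs without lxml installed. It compares passing a list of class names to find_all (which matches a tag carrying any of them) with a CSS selector that requires both classes.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this example.
html = """
<a class="btn btn-gray" href="/item/1">More Info</a>
<a class="btn btn-blue" href="/item/2">More Info</a>
<a class="btn btn-gray" href="/item/3">Buy Now</a>
<a class="btn-gray btn large" href="/item/4"> More Info </a>
"""

soup = BeautifulSoup(html, "html.parser")

# A list of class names matches tags that carry ANY of the listed classes.
either = soup.find_all("a", {"class": ["btn", "btn-gray"]})

# The CSS selector matches only tags that carry BOTH classes, in any order.
both = soup.select("a.btn.btn-gray")

# Keep the hrefs of the links whose text contains "More Info".
hrefs = [a["href"] for a in both if "More Info" in a.get_text()]
print(len(either), len(both), hrefs)
```

With this sample, the list form also picks up the "btn btn-blue" link, while the selector skips it. If some anchors might lack an href attribute, a.get("href") returns None instead of raising KeyError.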

Upvotes: 7

user2707389

Reputation: 827

# Select <a> tags that carry both classes, then keep those whose text contains "More Info".
buttons = soup.select('a.btn.btn-gray')

links = [link for link in buttons if "More Info" in link.text]

Upvotes: 2
