Reputation: 178
I'm trying to scrape links like <a href="http://www.example.com/default.html">Example</a>.
I'd like to load them into a dictionary as {Example: link},
where the link has the HTML tags stripped and is the URL someone would click.
I know how to get the links; I'm just not sure how to keep them connected to the displayed text.
Upvotes: 2
Views: 243
Reputation: 474003
Generally, if you can already extract the href values, mapping texts to links takes only two more steps: getting the text of each element and building a dictionary. Since the link and the text come from the same element, you can do both in a single dictionary comprehension.
Working example:
from bs4 import BeautifulSoup

html = """
<div>
    <a href="https://google.com">Google</a>
    <a href="https://stackoverflow.com">Stackoverflow</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each link's visible text to its href value.
print({
    a.get_text(strip=True): a["href"]
    for a in soup.find_all("a")
})
Prints:
{'Google': 'https://google.com', 'Stackoverflow': 'https://stackoverflow.com'}
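On a real page, two things may get in the way: some <a> tags have no href attribute (so a["href"] raises KeyError), and some hrefs are relative rather than the full URL someone would click. A minimal sketch of one way to handle both, assuming you know the URL the page was fetched from; the links_by_text helper, the base_url parameter, and the sample HTML are made up for illustration:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def links_by_text(html, base_url):
    """Map each link's visible text to its absolute URL.

    Hypothetical helper: base_url is assumed to be the address
    the page was downloaded from.
    """
    soup = BeautifulSoup(html, "html.parser")
    return {
        # urljoin resolves relative hrefs against base_url and
        # leaves already-absolute URLs unchanged.
        a.get_text(strip=True): urljoin(base_url, a["href"])
        # href=True skips <a> tags that have no href attribute.
        for a in soup.find_all("a", href=True)
    }


html = """
<div>
    <a href="/default.html">Example</a>
    <a name="top">Not a link</a>
    <a href="https://stackoverflow.com">Stackoverflow</a>
</div>
"""

result = links_by_text(html, "http://www.example.com/")
print(result)
```

Note that dictionary keys are unique, so if two links share the same text, the later one overwrites the earlier one.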
Upvotes: 1