Kyehee Kim

Reputation: 89

How can I get the href attribute in BeautifulSoup?

I am using BeautifulSoup in Python.

<div class="test1">
   <a href="www.google.com" blur blur~> text </a>
</div>

<div class="test2">
   <a href="www.stackoverflow.com" blur blur~> text </a>
</div>

<div class="test3">
   <a href="www.msn.com" blur blur~> text </a>
</div>

<div class="test4">
   <a href="www.naver.com" blur blur~> text </a>
</div>

<div class="test5">
   <a href="www.ios.com" blur blur~> text </a>
</div>

In a situation like this, I want to get one specific href. For example, how can I use the class name when I need href='www.ios.com'?

The HTML file has more than 1000 'a' tags, and the URLs they contain are dynamic.

How can I do this? Please help T.T

Upvotes: 1

Views: 15673

Answers (2)

Govind Sharma

Reputation: 21

# `results` is assumed to be a list of tags, e.g. soup.find_all("div")
for item in results:
    a = item.find("a")
    item_href = a['href']
    print(item_href)
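To make the loop above concrete, here is a minimal sketch. The answer does not say where `results` comes from, so this assumes it is the list returned by `soup.find_all("div")`; the sample HTML is also made up for illustration. It skips any `div` without a link, because `item.find("a")` returns `None` in that case and indexing it would raise an error.

```python
from bs4 import BeautifulSoup

# Sample markup for illustration only; the middle div has no link.
html = '''
<div class="test1"><a href="www.google.com"> text </a></div>
<div class="test2"><span>no link here</span></div>
<div class="test5"><a href="www.ios.com"> text </a></div>
'''

soup = BeautifulSoup(html, "html.parser")

# Assumption: `results` is every <div> in the document.
results = soup.find_all("div")

hrefs = []
for item in results:
    a = item.find("a")
    if a is None:
        # This div contains no <a> tag, so there is nothing to read.
        continue
    href = a.get("href")  # .get() returns None instead of raising KeyError
    if href is not None:
        hrefs.append(href)

print(hrefs)
```

With 1000+ links, collecting the values in a list like this is usually easier to work with than printing them one by one.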

Upvotes: 0

furas

Reputation: 142641

Here is a full working example.

You can use select() with CSS selectors such as .class, #id, and tag names.

from bs4 import BeautifulSoup

content='''<div class="test1">
   <a href="www.google.com" blur blur~> text </a>
</div>

<div class="test2">
   <a href="www.stackoverflow.com" blur blur~> text </a>
</div>

<div class="test3">
   <a href="www.msn.com" blur blur~> text </a>
</div>

<div class="test4">
   <a href="www.naver.com" blur blur~> text </a>
</div>

<div class="test5">
   <a href="www.ios.com" blur blur~> text </a>
</div>'''

soup = BeautifulSoup(content, 'html.parser')

all_a = soup.select('.test5 a')

for a in all_a:
    print(a['href'])

# www.ios.com

http://www.crummy.com/software/BeautifulSoup/bs4/doc/
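If you only need one link per class, `select_one()` returns the first match (or `None`) instead of a list. Below is a rough sketch of that idea; the helper name `href_for_class` and the trimmed sample HTML are my own, not something from the question, so treat it as one possible approach rather than the only way.

```python
from bs4 import BeautifulSoup

def href_for_class(soup, class_name):
    """Return the href of the first <a> inside an element with this class, or None."""
    # "a[href]" only matches <a> tags that actually have an href attribute.
    a = soup.select_one(f".{class_name} a[href]")
    return a["href"] if a is not None else None

# Trimmed-down version of the HTML from the question.
html = '''
<div class="test1"><a href="www.google.com"> text </a></div>
<div class="test5"><a href="www.ios.com"> text </a></div>
'''

soup = BeautifulSoup(html, "html.parser")

print(href_for_class(soup, "test5"))    # www.ios.com
print(href_for_class(soup, "missing"))  # None

# Roughly the same lookup without CSS selectors, using find() and class_.
div = soup.find("div", class_="test5")
link = div.find("a", href=True) if div is not None else None
print(link["href"] if link is not None else None)  # www.ios.com
```

The `None` checks matter when the page is dynamic: if a class is absent on some loads, the lookup returns `None` rather than raising an exception.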

Upvotes: 7
