Reputation: 73
I want to know how to find a tag inside another tag.
The data source is like this:
<ul class="DB_su a1" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
<ul class="DB_su a2" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
<ul class="DB_su a3" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
...
This is the Python code I wrote based on that HTML:
for flink in range(11):
    count = str(flink + 1)
    ss = soup.find('ul', class_='DB_su a' + count)
    dd = ss.findAllNext('a')
    print(dd)
This returned more than I wanted: not only the <a> tags inside the matched <ul>, but every <a> tag that comes after it in the document.
I want to get only the href values inside that tag:
[link, link2, link3, link4, link5, link6]
Upvotes: 0
Views: 240
Reputation: 2094
Inside your "for flink in range(11)" loop, try something like this. Calling find_all on the matched <ul> searches only inside that tag:
from bs4 import BeautifulSoup
html = """
<ul class="DB_su a1" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
<ul class="DB_su a2" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
<ul class="DB_su a3" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
<li><a href="link5">text5</a></li>
<li><a href="link6">text6</a></li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')
for n in soup.find_all('ul', attrs={'class': 'DB_su a3'}):
    for x in n.find_all('a'):
        print(x.get('href'))
result:
link
link2
link3
link4
link5
link6
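The same idea can be applied to the loop from the question. This is only a minimal sketch, assuming the lists are numbered a1 through a11 as range(11) suggests and that bs4 is installed. The helper name hrefs_by_list and the shortened HTML are made up for illustration. find_all searches only the descendants of the matched <ul>, whereas findAllNext (find_all_next) keeps going through everything that follows it in the document, which is likely why the original loop returned extra links.

```python
from bs4 import BeautifulSoup

# Shortened sample HTML (hypothetical), same structure as in the question
html = """
<ul class="DB_su a1" style="display: none;">
<li><a href="link">text</a></li>
<li><a href="link2">text2</a></li>
</ul>
<ul class="DB_su a2" style="display: none;">
<li><a href="link3">text3</a></li>
<li><a href="link4">text4</a></li>
</ul>
"""

soup = BeautifulSoup(html, 'html.parser')


def hrefs_by_list(soup, count=11):
    """Map each list suffix ('a1', 'a2', ...) to the hrefs inside that <ul> only."""
    result = {}
    for i in range(1, count + 1):
        # Match the exact class string, as the question's code does
        ul = soup.find('ul', class_='DB_su a' + str(i))
        if ul is None:
            # This numbered list is not on the page; skip it
            continue
        # find_all looks only at descendants of this <ul>
        result['a' + str(i)] = [a.get('href') for a in ul.find_all('a')]
    return result


links = hrefs_by_list(soup)
print(links)
```

The None check matters because soup.find returns None when a numbered list is missing, and calling find_all on None would raise an AttributeError.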
Upvotes: 1