Reputation: 27
With BeautifulSoup, I'm trying to print text that's inside a specific tag. The problem is that the text I want is inside a <td> tag within a <tr> tag, and the web page has 30 <tr> tags.
The text I need to print is in the second <td> tag inside the 19th occurrence of the <tr> tag.
It looks like this:
<tr>...</tr>
<tr>...</tr>
<tr>
<td class="QL">Text1</td>
<td class="QL">Text2</td>
<td class="QL">Text3</td>
</tr>
<tr>...</tr>
<tr>...</tr>
I want to print Text2.
Here's my try at it:
from urllib.request import urlopen
from bs4 import BeautifulSoup
quote_page = 'http://google.com'
page = urlopen(quote_page)
soup = BeautifulSoup(page, 'html.parser')
for link in soup.find("td", {"class": "QL"}):
    print(link)
As it is, it's printing the first occurrence of the <td class="QL"> tag. How do I make it print the text inside the 19th occurrence of that tag, without printing Text1 and Text3 as well?
Upvotes: 2
Views: 1777
Reputation: 1991
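You could try this. It collects the text of the second <td> in every <tr>, so index the resulting list to pick the row you want: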
result = [ x.text for x in soup.select('tr > td:nth-of-type(2)')]
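A minimal sketch of how the selector approach could be narrowed to a single row. The inline HTML is a hypothetical stand-in for the real page (30 rows, three cells each), and the names second_cells, text_by_index and text_by_selector are made up for illustration. Note that :nth-of-type counts siblings under the same parent, so the second variant only matches the page-wide 19th row if all rows share one parent.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: 30 rows, three cells each.
rows = "".join(
    "<tr>" + "".join(f'<td class="QL">r{r}c{c}</td>' for c in (1, 2, 3)) + "</tr>"
    for r in range(1, 31)
)
soup = BeautifulSoup(f"<table>{rows}</table>", "html.parser")

# Variant 1: second cell of every row, then pick the 19th entry (index 18).
second_cells = [td.get_text(strip=True) for td in soup.select("tr > td:nth-of-type(2)")]
text_by_index = second_cells[18]

# Variant 2: let the selector pick the row too (rows must be siblings).
cell = soup.select_one("tr:nth-of-type(19) > td:nth-of-type(2)")
text_by_selector = cell.get_text(strip=True) if cell is not None else None

print(text_by_index, text_by_selector)
```

Variant 1 skips rows that have fewer than two <td> cells (for example a header row made of <th>), which would shift the index, so it is worth checking the real markup first.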
Upvotes: 0
Reputation: 7238
As you know the exact positions of the tags you want to find, you can use find_all(), which returns a list, and then get the tag at the required index.
In this case (19th <tr> and 2nd <td>), use this:
result = soup.find_all('tr')[18].find_all('td')[1].text
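The one-liner raises IndexError if the page has fewer rows or cells than expected. Below is a sketch of the same indexing wrapped in a bounds-checked helper; cell_text is a hypothetical name, and the inline HTML is a made-up stand-in for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: 30 rows, three cells each.
rows = "".join(
    "<tr>" + "".join(f'<td class="QL">r{r}c{c}</td>' for c in (1, 2, 3)) + "</tr>"
    for r in range(1, 31)
)
soup = BeautifulSoup(f"<table>{rows}</table>", "html.parser")


def cell_text(soup, row, col):
    """Return the text of the col-th <td> in the row-th <tr> (1-based), or None."""
    trs = soup.find_all("tr")
    if not 1 <= row <= len(trs):
        return None  # the page has fewer rows than requested
    tds = trs[row - 1].find_all("td")
    if not 1 <= col <= len(tds):
        return None  # the row has fewer cells than requested
    return tds[col - 1].get_text(strip=True)


print(cell_text(soup, 19, 2))
```

Returning None instead of raising is only one possible choice; raising a descriptive error may suit a scraper that should fail loudly when the page layout changes.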
Upvotes: 1
Reputation: 71451
You can use enumerate with find_all. Note that this picks the 19th <td class="QL"> on the page, counting cells rather than rows:
result = [a.text for i, a in enumerate(soup.find_all("td", {"class": "QL"}), start=1) if i == 19][0]
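To illustrate the difference between counting cells and counting rows, here is a sketch on a hypothetical stand-in page. It assumes every row holds exactly three QL cells (as in the sample markup); CELLS_PER_ROW and the other names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: 30 rows of three QL cells.
rows = "".join(
    "<tr>" + "".join(f'<td class="QL">r{r}c{c}</td>' for c in (1, 2, 3)) + "</tr>"
    for r in range(1, 31)
)
soup = BeautifulSoup(f"<table>{rows}</table>", "html.parser")

cells = soup.find_all("td", {"class": "QL"})

# 19th QL cell on the page: the same element the enumerate version selects.
nineteenth_cell = cells[18].text

# Second cell of the 19th row, assuming exactly three QL cells per row.
CELLS_PER_ROW = 3  # assumption taken from the sample markup
flat_index = (19 - 1) * CELLS_PER_ROW + (2 - 1)
second_in_row_19 = cells[flat_index].text

print(nineteenth_cell, second_in_row_19)
```

If rows can have a varying number of cells, the flat index no longer lines up with rows, and indexing the <tr> list first is the safer route.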
Upvotes: 1