Reputation: 65
I want to find a specific tag in some HTML. For example, if two tags share the same id, how can I get the contents of the second tag rather than the first one, which is what soup.find(id='contact1') returns? Here is the example HTML:
<table align="center"><th id="contact">STUDENT ID</th><th id="contact">NAME</th><th id="contact"> Phone </th><th id="contact"> NO.</th>
<p align="center" style="display:compact; font-size:18px; font-family:Arial, Helvetica, sans-serif; color:#CC3300">
</p><tr>
<td id="contact1">
2011XXA4438F </td> <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td> <td id="contact1"> 9894398690 </td> <td id="contact1"> </td>
</tr>
</table>
What I want to do is extract '2011XXA4438F' as a string. How can I do this?
Upvotes: 2
Views: 5043
Reputation: 2663
You can also do it this way, naming the tag as well as the id (the id is on the td cells, not on the table):
target = soup.find("td", {"id": "contact1"})
Upvotes: 0
Reputation: 873
I'm pretty sure .find only gives you the first element that matches your query. Try using .findAll instead.
Check documentation here - http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html
EDIT: I misread your post. Just to understand completely: do you always want to find the 2nd occurrence of "id='contact1'"?
There is probably something more elegant, but you could do something like
v = soup.find_all(id='contact1')
length = 0
for x in v:
    length += 1
    if length == 2:  # set number according to which occurrence you want
        # x is the second occurrence of id='contact1'
        print(x.text.strip())
        break
The above is completely untested and was written directly here. I've only just started using Python, so there is probably a more efficient way of doing it :-)
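Since find_all returns a list-like result, the counting loop above can probably be replaced by plain indexing. Here is a minimal sketch, assuming Beautiful Soup 4 with the built-in "html.parser" backend; the helper name nth_match_text and the trimmed-down HTML are made up for illustration and are not part of the question.

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the table row from the question.
HTML = """
<table><tr>
<td id="contact1">
2011XXA4438F </td> <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td> <td id="contact1"> 9894398690 </td> <td id="contact1"> </td>
</tr></table>
"""


def nth_match_text(soup, n, **attrs):
    """Hypothetical helper: stripped text of the n-th (0-based) matching tag.

    Returns None when there are fewer than n + 1 matches.
    """
    matches = soup.find_all(**attrs)
    if 0 <= n < len(matches):
        return matches[n].get_text(strip=True)
    return None


soup = BeautifulSoup(HTML, "html.parser")
first = nth_match_text(soup, 0, id="contact1")   # the student ID cell
second = nth_match_text(soup, 1, id="contact1")  # the name cell
print(first)
print(second)
```

Indexing is 0-based, so the "second occurrence" is index 1. The bounds check avoids an IndexError when the page has fewer matches than expected.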
Upvotes: 1
Reputation: 60024
<td id="contact1"> is the first tag with an id of "contact1". To obtain it, soup.find is all you need:
>>> print soup.find(id='contact1').text.strip()
2011XXA4438F
If you're looking for the other tags, then you'll want to use find_all:
>>> print soup.find_all(id='contact1')
[<td id="contact1">
2011XXA4438F </td>, <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td>, <td id="contact1"> 9894398690 </td>, <td id="contact1"> </td>]
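To go from that list of tags to clean strings, one option is to strip the text of each cell in a list comprehension. This is only a sketch, assuming Beautiful Soup 4 and the "html.parser" backend; the variable names (cells, student_id, name, phone, number) are illustrative, and the HTML is a shortened copy of the row from the question.

```python
from bs4 import BeautifulSoup

# Shortened copy of the row from the question.
html = """
<table><tr>
<td id="contact1">
2011XXA4438F </td> <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td> <td id="contact1"> 9894398690 </td> <td id="contact1"> </td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# One stripped string per matching cell; a whitespace-only cell becomes "".
cells = [td.get_text(strip=True) for td in soup.find_all("td", id="contact1")]

# Unpack by position (assumes the row always has exactly four cells).
student_id, name, phone, number = cells
print(student_id)
```

Note that repeating the same id on several elements is not valid HTML, so matching on position as done here depends on the page keeping its column order.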
Upvotes: 4