Reputation: 189
I have the following html:
<div id="contentDiv">
<!-- START FILER DIV -->
<div style="margin: 15px 0 10px 0; padding: 3px; overflow: hidden; background-color: #BCD6F8;">
<div class="mailer">Mailing Address
<span class="mailerAddress">500 ORACLE PARKWAY</span>
<span class="mailerAddress">MAIL STOP 5 OP 7</span>
<span class="mailerAddress">REDWOOD CITY CA 94065</span>
</div>
I am trying to access "500 ORACLE PARKWAY" and "MAIL STOP 5 OP 7", but I cannot find a way to do it. My attempt was this:
for item in soup.findAll("span", {"class" : "mailerAddress"}):
    if item.parent.name == 'div':
        return_list.append(item.contents)
Edit: I forgot to mention that later elements in the HTML use the same tags, so my loop captures all of them when I only want the first two.
Edit: link: https://www.sec.gov/cgi-bin/browse-edgar?CIK=orcl
Upvotes: 1
Views: 3563
Reputation: 22440
Try this:
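from bs4 import BeautifulSoup
import requests

# Download the page and parse it with the lxml parser
res = requests.get("https://www.sec.gov/cgi-bin/browse-edgar?CIK=orcl").text
soup = BeautifulSoup(res, 'lxml')

# Keep only the first two elements with class "mailerAddress"
for item in soup.find_all(class_="mailerAddress")[:2]:
    print(item.text)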
Result:
500 ORACLE PARKWAY
MAIL STOP 5 OP 7
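Slicing the whole page's matches relies on the mailing address being the first block in the document. A possible alternative, sketched here against inline sample markup rather than the live page, is to find the first `div` with class `mailer` and search only inside it. The second `div` in the sample is a hypothetical stand-in for the later elements that reuse the class, and `first_mailer_lines` is an illustrative helper name, not something from the original post. The built-in `html.parser` is used so the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the question. The second block is a hypothetical
# stand-in for later elements that reuse the "mailerAddress" class.
SAMPLE_HTML = """
<div id="contentDiv">
  <div class="mailer">Mailing Address
    <span class="mailerAddress">500 ORACLE PARKWAY</span>
    <span class="mailerAddress">MAIL STOP 5 OP 7</span>
    <span class="mailerAddress">REDWOOD CITY CA 94065</span>
  </div>
  <div class="mailer">Business Address
    <span class="mailerAddress">1 EXAMPLE STREET</span>
  </div>
</div>
"""


def first_mailer_lines(html, count=2):
    """Return the text of the first `count` address spans in the first mailer div."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns only the first matching element, or None if there is none
    first_block = soup.find("div", class_="mailer")
    if first_block is None:
        return []
    # limit stops the search after `count` matches inside that block
    spans = first_block.find_all("span", class_="mailerAddress", limit=count)
    return [span.get_text(strip=True) for span in spans]


print(first_mailer_lines(SAMPLE_HTML))
```

Scoping to the parent block first means spans from other blocks are never collected, whatever their order in the page.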
Upvotes: 1
Reputation: 104
I'll attempt an answer with the limited information we have. If you only want the first two elements of a certain class on a webpage, you can slice the list of matches:
soup.findAll("span", {"class" : "mailerAddress"})[0:2]
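As a minimal sketch of the slice in context, the snippet below parses the three spans copied from the question (not the live page) and extracts the text of the first two. It uses `find_all`, the current spelling of `findAll`, and the built-in `html.parser`; the variable names `matches` and `first_two` are illustrative.

```python
from bs4 import BeautifulSoup

# The three address spans copied from the question's markup
html = """
<span class="mailerAddress">500 ORACLE PARKWAY</span>
<span class="mailerAddress">MAIL STOP 5 OP 7</span>
<span class="mailerAddress">REDWOOD CITY CA 94065</span>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like result, so ordinary slicing works on it
matches = soup.find_all("span", {"class": "mailerAddress"})
first_two = [tag.get_text(strip=True) for tag in matches[0:2]]

print(first_two)
```

The slice is applied after all matches are collected, so it only gives the intended result when the two wanted spans come first in document order.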
Upvotes: 0