Reputation: 14998
I am trying to scrape https://en.wikiquote.org/wiki/Remember_the_Titans#Coach_Boone. I want to get the quotes from all sections except Dialogue, Taglines and External Links. I can select ul > li, but that fetches everything on the page. How can I fetch only the ul > li elements that come after the following HTML?
<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Remember_the_Titans&action=edit&section=1" title="Edit section: Coach Boone">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
Upvotes: 2
Views: 57
Reputation: 474041
Once you've located the h2 element, use the .find_next_siblings() method to get the ul sibling elements that follow it:
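    # Locate the headline span by id, then climb to its enclosing h2
    h2 = soup.find("span", id="Coach_Boone").find_parent('h2')
    # Walk every ul that follows the h2 at the same nesting level
    for ul in h2.find_next_siblings("ul"):
        for li in ul.find_all("li"):
            print(li)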
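One thing to keep in mind: find_next_siblings("ul") returns every later ul sibling, so it can also pick up lists that belong to subsequent sections such as Taglines. Below is a minimal sketch of one way to stop at the next h2 and skip the unwanted sections. The SAMPLE_HTML string, the SKIP set and the quotes_by_section helper are all hypothetical names made up for illustration, and the sample markup is a trimmed-down stand-in modelled on the snippet in the question; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup, modelled on the question's snippet.
SAMPLE_HTML = """
<div>
<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span></h2>
<ul><li>Boone quote 1</li></ul>
<ul><li>Boone quote 2</li></ul>
<h2><span class="mw-headline" id="Dialogue">Dialogue</span></h2>
<dl><dd>Some dialogue</dd></dl>
<h2><span class="mw-headline" id="Taglines">Taglines</span></h2>
<ul><li>A tagline</li></ul>
</div>
"""

# Section titles to ignore (assumed to match the headline text exactly).
SKIP = {"Dialogue", "Taglines", "External links"}


def quotes_by_section(html, skip=SKIP):
    """Map each wanted section title to the text of its top-level li items."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for headline in soup.find_all("span", class_="mw-headline"):
        title = headline.get_text(strip=True)
        if title in skip:
            continue
        h2 = headline.find_parent("h2")
        if h2 is None:
            continue  # headline is not inside an h2, so it is not a top-level section
        quotes = []
        for sibling in h2.find_next_siblings():
            if sibling.name == "h2":
                break  # reached the next section, stop collecting
            if sibling.name == "ul":
                for li in sibling.find_all("li", recursive=False):
                    quotes.append(li.get_text(" ", strip=True))
        result[title] = quotes
    return result


sections = quotes_by_section(SAMPLE_HTML)
print(sections)
```

With real data you would pass the downloaded page source to quotes_by_section instead of SAMPLE_HTML. Using recursive=False keeps nested attribution lists from being reported as separate quotes, though their text still appears inside the parent item.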
Upvotes: 2