Python web scraping: how to ignore child elements


I am trying to get the text from the second <p class="text-muted"> element.

Generally I use:

outline = soup.find_all("p", {"class":"text-muted"})

for item in outline:
    print(item.text)

or

print(item.contents[3].text)

with index 1, 2, or 3, whichever one holds what I'm looking for. But this page has two elements with "class":"text-muted". The first snippet prints everything in the element: the text I want plus the text of all its children. And when I add .contents[0] (or 1, 2, 10, ...) I get IndexError: list index out of range.

How do I print only the text that sits directly inside <p class="text-muted"> and ignore all of its children?

Upvotes: 0

Views: 1020

Answers (1)

Raj Damani

Reputation: 792

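for item in soup.find_all("p", {"class": "text-muted"}):
    print(item.find_all(text=True, recursive=False))

Calling find_all(text=True, recursive=False) on each matched tag returns only the text nodes that are direct children of that tag; text inside nested child elements is not included. The filter has to be applied to the <p> tag itself, because recursive=False on soup would only look at the top-level nodes of the document.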
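Here is a minimal sketch of the same idea applied to the second matching element. The HTML is made-up markup standing in for the page in the question, and own_text is a hypothetical helper name, not part of BeautifulSoup. It uses string=True, which is the newer spelling of text=True in recent BeautifulSoup 4 releases.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page from the question is not shown.
html = """
<div>
  <p class="text-muted">First block <span>child one</span></p>
  <p class="text-muted">
    Wanted text
    <a href="#">link child</a>
    <span>another child</span>
  </p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def own_text(tag):
    """Join the text nodes that are direct children of `tag`.

    recursive=False keeps the search from descending into child
    elements, so their text is left out.
    """
    pieces = tag.find_all(string=True, recursive=False)
    # Drop whitespace-only nodes left over from indentation.
    return " ".join(p.strip() for p in pieces if p.strip())


paragraphs = soup.find_all("p", class_="text-muted")
second = paragraphs[1]  # index 1 is the second match

print(own_text(second))
```

The whitespace between child tags shows up as separate text nodes, which is why the helper strips and filters them before joining.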

Upvotes: 1
