M Talha Afzal

Reputation: 241

Python - Get paragraph text (Web Scraping)

How can I get the text of a paragraph split at its break tags, for example:

      <p align="right">
        <font size="3">
          ABC
          <br/>
          DEF
          <br/>
          FGH
          <br/>
          iJK
        </font>
      </p>

and save the pieces in a list like:

text[0] = "ABC"
text[1] = "DEF"
text[2] = "FGH"
text[3] = "iJK"

I am currently using:

paragraph_text = soup.find('p')
print(paragraph_text.text)

But this gives me all of the paragraph's text as a single string.

Upvotes: 1

Views: 2325

Answers (1)

alecxe

Reputation: 473763

Locate the p element and iterate over its .stripped_strings generator:

for text in soup.p.stripped_strings:
    print(text)

Prints:

ABC
DEF
FGH
iJK

Or, if you want a list:

texts = list(soup.p.stripped_strings)
print(texts)

Prints:

['ABC', 'DEF', 'FGH', 'iJK']
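
For completeness, here is a self-contained sketch of the same approach that runs without an existing soup object. It assumes the markup from the question and the built-in "html.parser" backend; the paragraph_lines helper and the HTML constant are hypothetical names used only for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
HTML = """
<p align="right">
  <font size="3">
    ABC
    <br/>
    DEF
    <br/>
    FGH
    <br/>
    iJK
  </font>
</p>
"""


def paragraph_lines(html):
    """Return the stripped text fragments of the first <p> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    paragraph = soup.find("p")
    if paragraph is None:
        # No <p> element found, so there is nothing to split.
        return []
    # stripped_strings yields each text node with surrounding whitespace
    # removed and skips nodes that contain only whitespace.
    return list(paragraph.stripped_strings)


texts = paragraph_lines(HTML)
print(texts)
```

Because the <br/> tags separate the text nodes, each fragment ends up as its own list item, so texts[0] is "ABC" and texts[3] is "iJK".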

Upvotes: 1
