Mark K

Reputation: 9348

BeautifulSoup to output extracted text line by line

Below is a sample of the HTML. I am using BeautifulSoup to extract the text.

txt = """[<dd class="qs" id="qsff"><br/>Pretty women wonder where my secret lies. <br/>I'm not cute or built to suit a fashion model's size<br/>But when I start to tell them,<br/>They think I'm telling lies.<br/><br/>I say,<br/>It's in the reach of my arms<br/>The span of my hips,<br/>The stride of my step,<br/>The curl of my lips.<br/><br/></dd>]"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(txt, "lxml")

for node in soup:
    print(node.text)

# [Pretty women wonder where my secret lies. I'm not cute or built to suit a fashion model's sizeBut when I start to tell them,They think I'm telling lies.I say,It's in the reach of my armsThe span of my hips,The stride of my step,The curl of my lips.]

This prints the text as one long string, as shown above, but I want it line by line, like this:

Pretty women wonder where my secret lies.
I'm not cute or built to suit a fashion model's size
But when I start to tell them,
....

I tried the following, but it doesn't work:

for node in soup.find_all('br'):
    print(node.text)

What's the right way to output them line by line?

Upvotes: 3

Views: 806

Answers (1)

DYZ

Reputation: 57085

Iterate over strings, not nodes:

for node in soup.dd.strings:
    print(node)
# Pretty women wonder where my secret lies.
# I'm not cute or built to suit a fashion model's size
# But when I start to tell them,
# ....
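
A `<br/>` element has no text of its own, which is why looping over `find_all('br')` prints only empty lines; the text lives in the string nodes between the tags. If the trailing whitespace or empty fragments get in the way, a minimal sketch along these lines may help. It assumes a shortened copy of the sample without the square brackets, and it swaps in the built-in "html.parser" so that lxml is not required; the variable names `html`, `lines` and `text` are illustrative only.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's sample, without the enclosing square brackets.
html = (
    '<dd class="qs" id="qsff"><br/>Pretty women wonder where my secret lies. '
    "<br/>I'm not cute or built to suit a fashion model's size"
    "<br/>But when I start to tell them,<br/>They think I'm telling lies.<br/><br/></dd>"
)

# "html.parser" ships with Python; the question used "lxml" instead.
soup = BeautifulSoup(html, "html.parser")

# stripped_strings yields each text fragment with surrounding whitespace
# removed and skips fragments that are empty after stripping.
lines = list(soup.dd.stripped_strings)
for line in lines:
    print(line)

# get_text with a separator joins the same fragments into a single string.
text = soup.dd.get_text(separator="\n", strip=True)
```

`stripped_strings` is convenient when a list of lines is needed, while `get_text(separator="\n", strip=True)` is handier when one newline-delimited string is enough.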

And why do you enclose your text in square brackets?

Upvotes: 2
