ishandutta2007

Reputation: 18184

Why is the text within an inner tag ignored, and how do I fix it?

<p>The latest media Tweets from Yohir Akerman (@yohirakerman). My bio changes all the time. /// akermancolumnista<strong>@gmail.com</strong>. Airplane</p>

I try to extract the entire text as follows:

    body = response.xpath('//*[@id="b_results"]/p/text()').getall()
    print(body)

The output I get is:

    ['The latest media Tweets from Yohir Akerman (@yohirakerman). My bio changes '
     'all the time. /// akermancolumnista',
     '. Airplane']

The entire text within the <strong> tag is ignored. How do I fix it?

Upvotes: 0

Views: 31

Answers (1)

Psona

Reputation: 26

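Don't use text() here: /p/text() selects only the text nodes that are direct children of <p>, so anything inside <strong> is skipped. Select the whole element instead:

    body = response.xpath('//*[@id="b_results"]/p').getall()
    print(body)

Then join body and strip all the tags from it.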
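
A minimal sketch of the "join and clean" step, using only the standard library. The names `TextExtractor` and `strip_tags` are made up for this example, and the `body` list is a hard-coded stand-in for what `.getall()` would return on the page from the question:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps the text of an HTML fragment, drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text, at any nesting depth.
        self.parts.append(data)


def strip_tags(fragment):
    """Return the text content of an HTML fragment with all tags removed."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Stand-in for response.xpath('//*[@id="b_results"]/p').getall()
body = [
    "<p>The latest media Tweets from Yohir Akerman (@yohirakerman). "
    "My bio changes all the time. /// akermancolumnista"
    "<strong>@gmail.com</strong>. Airplane</p>"
]

# One cleaned string per <p>, joined into a single text.
text = " ".join(strip_tags(fragment) for fragment in body)
print(text)
```

If you would rather stay in XPath, changing `/p/text()` to `/p//text()` in the original query selects text nodes at any depth under the `<p>`, so the `<strong>` content is included and only a `"".join(...)` is needed afterwards.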

Upvotes: 1
