Scrapy: scraping nested text using CSS selectors

Question

I have the following html code:


Lorem ipsum si ammet

To get the text as "Lorem ipsum si ammet", I tried:

response.css('div.article >p::text ').extract()

But I only receive "Lorem si ammet".

How can I get both the outer text and the nested text using CSS selectors?

Umair Ayub · Accepted Answer

A one-liner solution:

"".join(a.strip() for a in response.css("div.article *::text").extract())

div.article * matches every element nested inside div.article, so ::text returns the text of each of them.

Or, written out as a loop:

text = ""
# Collect the text of every element nested inside div.article
for a in response.css("div.article *::text").extract():
    text += a.strip()

Both approaches produce the same result.
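One thing to watch for: stripping each fragment and joining with an empty string can fuse words that were separated only by whitespace at a tag boundary. The sketch below uses plain Python and no Scrapy. The `fragments` list is an assumed example of what `.extract()` might return for a paragraph with one nested inline element (the original HTML is not shown above), and the helper names `stripped_join` and `normalized_join` are hypothetical.

```python
# Assumed example of the list returned by .extract() for a paragraph
# containing one nested inline element; not taken from the original page.
fragments = ["Lorem ", "ipsum", " si ammet"]


def stripped_join(parts):
    """Strip each fragment, then join with no separator (as in the answer)."""
    return "".join(p.strip() for p in parts)


def normalized_join(parts):
    """Join fragments unchanged, then collapse runs of whitespace."""
    return " ".join("".join(parts).split())


print(stripped_join(fragments))    # Loremipsumsi ammet
print(normalized_join(fragments))  # Lorem ipsum si ammet
```

If the fragments carry meaningful whitespace at their edges, joining first and normalizing afterwards may preserve word boundaries better than stripping each piece.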
