Remove first tag html using python & scrapy

Question

I have the following HTML:

I used: response.xpath('//div[contains(@class,"abc")]/div[contains(@class,"xyz")]').extract()

Result:

u'['
        
        text
        text
        text
        text
    ']

I want to remove the div with class="needremove". Could you help me?

alecxe · Accepted Answer

You can get all the child tags except the div with class="needremove":

response.xpath('//div[contains(@class, "abc")]/div[contains(@class, "xyz")]/*[local-name() != "div" and not(contains(@class, "needremove"))]').extract()

Demo from the shell:

$ scrapy shell index.html
In [1]: response.xpath('//div[contains(@class, "abc")]/div[contains(@class, "xyz")]/*[local-name() != "div" and not(contains(@class, "needremove"))]').extract()
Out[1]: [u'text
', u'text
', u'text
', u'text']
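
For anyone who wants to check the filtering logic without a Scrapy project, here is a minimal sketch using only the standard library. The sample markup is an assumption (the question's HTML is not shown above), and `children_without` is a hypothetical helper, not part of Scrapy. It drops only the child that is both a `div` and carries the `needremove` class:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the original HTML is missing from the question,
# so this structure is only an assumption based on the XPath used above.
SAMPLE = """
<html><body>
<div class="abc">
  <div class="xyz">
    <div class="needremove">remove me</div>
    <p>text</p>
    <p>text</p>
    <p>text</p>
    <p>text</p>
  </div>
</div>
</body></html>
""".strip()


def children_without(parent, tag, class_fragment):
    """Return parent's child elements, skipping those that have the given
    tag name and whose class attribute contains class_fragment."""
    return [
        child for child in parent
        if not (child.tag == tag and class_fragment in child.get("class", ""))
    ]


root = ET.fromstring(SAMPLE)
# ElementTree supports only a subset of XPath (no contains()),
# so the containers are matched on the exact class value here.
xyz = root.find(".//div[@class='abc']/div[@class='xyz']")
kept = children_without(xyz, "div", "needremove")
print([ET.tostring(el, encoding="unicode").strip() for el in kept])
```

Note that the XPath in the answer skips every `div` child and every child whose class contains `needremove`, which is a bit broader than dropping only that one div. If other `div` children must be kept, a predicate such as `*[not(self::div and contains(@class, "needremove"))]` should be closer to the stated goal, though it has not been run against the original page.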
