capiono

Reputation: 2997

extract text with xpath from multiple sources

I built a scraper to extract text from 3 sites for my project, and I want to use a single spider for all 3. Two of the sites have their content in this structure:

<div id="site1">
   <p> this is a test </p>
</div>

<div id="site2">
   <p> this is a test </p>
</div>

and one has this:

<div class="site3">
   <p> <span> this is a test </span> </p>
</div>

I can extract the text from the 2 sites using this:

response.xpath('//div[@id="site1" or @id="site2" or @class="site3"]//p/text()').extract()

How do I modify this code to also pull the text from site3?

Upvotes: 0

Views: 58

Answers (1)

JavaPhySel

Reputation: 81

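In site3 the text sits inside a <span>, so //p/text() only matches the whitespace around it. Select the span's text with a second path and union the two:

response.xpath('//div[@id="site1" or @id="site2"]//p/text() | //div[@class="site3"]//p/span/text()').extract()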

Upvotes: 1
