Reputation: 549
I am trying to scrape a website using Scrapy. The items I want are each inside a div with a unique 4-digit numeric id, like the following:
<div id="3456" ...> Item 1 </div>
<div id="5643" ...> Item 2 </div>
<div id="8767" ...> Item 3 </div>
I need to know how to match any 4-digit id generically in the following command, so I can loop over each item for scraping.
for sel in response.xpath('//div[@id="4-digit-number-description"]'):
Upvotes: 1
Views: 295
Reputation: 473893
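Scrapy selectors support EXSLT regular expressions inside XPath expressions, so you can test the id attribute with re:test. Anchor the pattern with ^ and $ so that only ids made of exactly four digits match (without the anchors, an id such as "12345" or "item1234" would match too), and use a raw string so Python leaves the backslash alone:
response.xpath(r'//div[re:test(@id, "^\d{4}$")]')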
Upvotes: 1