LearnAWK

Reputation: 549

How to express a 4-digit number when using XPath in Scrapy

I am scraping a website with Scrapy. Each item I want sits inside a div with a unique 4-digit numeric id, like the following:

<div id="3456" ...> Item 1 </div>
<div id="5643" ...> Item 2 </div>
<div id="8767" ...> Item 3 </div>

I need to know how to express a generic 4-digit number in the following XPath, so that I can loop over every item and scrape it.

for sel in response.xpath('//div[@id="4-digit-number-description"]'):

Upvotes: 1

Views: 295

Answers (1)

alecxe

Reputation: 473893

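With Scrapy, you can use regular expressions inside XPath expressions through the EXSLT re: namespace, which is very convenient. Anchor the pattern with ^ and $ so that only ids consisting of exactly four digits match (an unanchored \d{4} would also match an id such as "12345"):

response.xpath(r'//div[re:test(@id, "^\d{4}$")]')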

Upvotes: 1
