Reputation: 792
On a download site, I want to scrape all the URLs for the mirror sites. I am using PHP.
For example, on this page:
http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html
I want to extract the following URLs:
http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=1
http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=2
Upvotes: 0
Views: 270
Reputation: 487
It is unclear where the "t" and "i" parameters come from: the source URL only contains the ID (p). The pattern below retrieves that last group of digits.
%(\d+)\.html$%
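As a minimal sketch, the pattern could be applied in PHP with `preg_match`. The variable names `$pageUrl` and `$id` are only illustrative; the URL is the one from the question.

```php
<?php
// Page URL taken from the question.
$pageUrl = 'http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html';

// "%" is used as the pattern delimiter, so the dot is the only character
// that needs escaping. The capture group holds the digits before ".html".
$id = null;
if (preg_match('%(\d+)\.html$%', $pageUrl, $m)) {
    $id = $m[1];
}

echo $id, PHP_EOL; // prints 99091
```

This only recovers the `p` value; the `t` and `i` values still have to come from somewhere else, such as the page's HTML.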
Upvotes: 0
Reputation: 152226
Try matching the page's HTML source with:
(http:\/\/drivers\.softpedia\.com\/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+)
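A rough sketch of how this pattern might be used with `preg_match_all` follows. The `$html` fragment is a made-up stand-in for the downloaded page (in a real script it could come from something like `file_get_contents()` on the page URL), and the assumption that the page writes `&` as `&amp;` inside its links is not confirmed, so the entities are decoded first to cover both cases.

```php
<?php
// Hypothetical HTML fragment standing in for the downloaded page.
$html = <<<'HTML'
<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&amp;t=0&amp;i=1">Mirror 1</a>
<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&amp;t=0&amp;i=2">Mirror 2</a>
HTML;

// Decode entities so that "&amp;" becomes "&" before matching.
$decoded = html_entity_decode($html);

// The pattern from the answer, wrapped in "/" delimiters.
$pattern = '/(http:\/\/drivers\.softpedia\.com\/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+)/';

preg_match_all($pattern, $decoded, $matches);

// $matches[1] holds the captured URLs; drop duplicates and reindex.
$mirrors = array_values(array_unique($matches[1]));

print_r($mirrors);
```

With the sample fragment above, `$mirrors` contains the two URLs listed in the question.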
Upvotes: 1