Thoman

Reputation: 792

Scrape page on download site to extract specific URLs

On a download site, I want to scrape all the URLs for the mirror sites. I am using PHP.

For example, on this page:

http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html

I want to extract the following URLs:

http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=1
http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=2

Upvotes: 0

Views: 270

Answers (2)

ashein

Reputation: 487

It is unclear where the "t" and "i" parameters come from: the source URL only contains the ID (p). The pattern below retrieves that last group of digits:

%(\d+)\.html$%
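
As a rough sketch of how that pattern could be used in PHP: the helper names `extractDownloadId` and `buildMirrorUrls` are made up for illustration, and the `t=0` / `i=1..n` layout is only an assumption taken from the two example URLs in the question. The number of mirrors cannot be derived from the page URL, so the caller has to supply it.

```php
<?php
/**
 * Extract the numeric ID that precedes ".html" at the end of a download-page URL.
 * Returns null when the URL does not end in "<digits>.html".
 * (extractDownloadId is a hypothetical helper name.)
 */
function extractDownloadId(string $pageUrl): ?string
{
    if (preg_match('%(\d+)\.html$%', $pageUrl, $m) === 1) {
        return $m[1];
    }
    return null;
}

/**
 * Assumption: mirrors follow the dyn-postdownload.php?p=<id>&t=0&i=<n> form
 * shown in the question. The mirror count is supplied by the caller.
 */
function buildMirrorUrls(string $id, int $mirrorCount): array
{
    $urls = [];
    for ($i = 1; $i <= $mirrorCount; $i++) {
        $urls[] = "http://drivers.softpedia.com/dyn-postdownload.php?p={$id}&t=0&i={$i}";
    }
    return $urls;
}

$page = 'http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html';
$id = extractDownloadId($page);
if ($id !== null) {
    print_r(buildMirrorUrls($id, 2));
}
```

This only reconstructs URLs from the ID; if the site changes the "t" or "i" values, matching the links in the page HTML directly is likely the safer route.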

Upvotes: 0

hsz

Reputation: 152226

Try with:

(http:\/\/drivers\.softpedia\.com\/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+)
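
A minimal sketch of applying this pattern with `preg_match_all`, assuming the page HTML is already in a string (fetched, for example, with `file_get_contents` or cURL). The helper name `extractMirrorUrls` and the sample markup are invented for illustration, and replacing `&amp;` first is a guess that the links may be entity-encoded in the real page source.

```php
<?php
// Same pattern as above, wrapped in '~' delimiters so the slashes need no escaping.
const MIRROR_PATTERN = '~http://drivers\.softpedia\.com/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+~';

/**
 * Return every unique mirror URL found in an HTML string.
 * (extractMirrorUrls is a hypothetical helper name, not a library function.)
 */
function extractMirrorUrls(string $html): array
{
    // Assumption: hrefs may contain "&amp;" instead of "&", so normalise first.
    $normalised = str_replace('&amp;', '&', $html);
    preg_match_all(MIRROR_PATTERN, $normalised, $matches);
    // $matches[0] holds the full matches; drop duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Made-up sample markup standing in for the real download page.
$sample = '<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&amp;t=0&amp;i=1">Mirror 1</a>'
        . '<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=2">Mirror 2</a>';

print_r(extractMirrorUrls($sample));
```

If the markup turns out to be more irregular (relative links, different hosts), parsing the anchors with `DOMDocument` and filtering the `href` values may hold up better than a single regex.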

Upvotes: 1
