rockstardev

Reputation: 13527

PHP: Convert URLs in HTML to fully qualified URLs?

I can scrape a page for URLs, but I want to know the easiest way to convert the various forms these links can take into fully qualified URLs. For example:

If I scrape: www.mysite.com/some/place/in/space.html

And I get the following URLs:

../img.jpg
img.jpg
../../bla.jpg
inc/bla.jpg
/
./

They should resolve to:

www.mysite.com/some/place/img.jpg
www.mysite.com/some/place/in/img.jpg
www.mysite.com/some/bla.jpg
www.mysite.com/some/place/in/inc/bla.jpg
www.mysite.com/
www.mysite.com/some/place/in/

Is there a function that does this for all cases, or is it something I would have to code myself?

Upvotes: 0

Views: 107

Answers (3)

dynamic

Reputation: 48131

I use this function for a crawler I wrote a long time ago: http://codepad.org/1VxMECNj

Call the function with the host prepended:

relativeUrl('http://host/dir/dir2/../../file.html');
//> returns http://host/file.html
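In case the codepad link goes away, here is a minimal sketch of the same idea. It is not the code behind that link: collapseDotSegments is a hypothetical name, and it assumes the input is already an absolute URL with a scheme and host. It drops any user:password part and squashes empty segments such as "//" inside the path, so treat it as a starting point rather than a complete implementation.

```php
<?php
// Hypothetical helper (not the function from the linked codepad).
// Collapses "." and ".." path segments in an absolute URL.
function collapseDotSegments(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL: leave it untouched
    }

    $path = $parts['path'] ?? '/';
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);          // step up one directory
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;        // ordinary path segment
        }
    }

    $newPath = '/' . implode('/', $resolved);
    // Keep a trailing slash when the path ended in "/", "/." or "/.."
    if ($resolved !== [] && preg_match('#/(\.\.?)?$#', $path)) {
        $newPath .= '/';
    }

    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= $newPath;
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo collapseDotSegments('http://host/dir/dir2/../../file.html'), "\n";
// prints http://host/file.html
```

Applied to the question's examples, http://www.mysite.com/some/place/in/../img.jpg comes out as http://www.mysite.com/some/place/img.jpg.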

Upvotes: 1

Stelian Matei

Reputation: 11623

You could use a regex to replace the relative links with absolute URLs:

$data = preg_replace('#(href|src)="([^:"]*)("|(?:(?:%20|\s|\+)[^"]*"))#', '$1="' . $site_url . '$2$3', $data);
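To see what that replacement does in practice, here is a small usage sketch. The sample HTML and the value of $site_url are made up for illustration; $site_url is assumed to be the directory of the scraped page, with a trailing slash.

```php
<?php
// Assumed values for illustration only.
$site_url = 'http://www.mysite.com/some/place/in/';
$data = '<a href="inc/bla.jpg">a</a> <img src="../img.jpg"> <a href="http://example.com/x">b</a>';

// Same pattern as above: prefix every href/src value that contains no ":".
$out = preg_replace(
    '#(href|src)="([^:"]*)("|(?:(?:%20|\s|\+)[^"]*"))#',
    '$1="' . $site_url . '$2$3',
    $data
);

echo $out, "\n";
// prints:
// <a href="http://www.mysite.com/some/place/in/inc/bla.jpg">a</a> <img src="http://www.mysite.com/some/place/in/../img.jpg"> <a href="http://example.com/x">b</a>
```

The absolute link is left alone because it contains a colon. The "../" is only prefixed, not collapsed, and a root-relative link such as "/" would get the whole directory prepended as well, so those cases probably need extra handling.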

Upvotes: 0

Richard

Reputation: 4415

You can just prepend www.mysite.com/some/place/in/ to the URLs. For example, www.mysite.com/some/place/in/../img.jpg should resolve, I think.
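A rough sketch of that prefixing approach follows. prependBase is a hypothetical name, and the page URL is assumed to include a scheme (http://...), since parse_url cannot pick out the host otherwise. Links starting with "/" are treated separately because, under the usual URL resolution rules, they are relative to the site root rather than to the current directory.

```php
<?php
// Hypothetical helper: builds an absolute URL by prefixing only.
// It does not collapse "../" segments.
function prependBase(string $pageUrl, string $link): string
{
    // Link already has a scheme (http:, mailto:, ...): return unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return $link;
    }

    $parts  = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];

    // Protocol-relative link (//cdn.example.com/x): reuse the page's scheme.
    if (substr($link, 0, 2) === '//') {
        return $parts['scheme'] . ':' . $link;
    }
    // Root-relative link: relative to the site root, not the directory.
    if (substr($link, 0, 1) === '/') {
        return $origin . $link;
    }

    // Everything else: prefix with the directory of the scraped page.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $link;
}

$page = 'http://www.mysite.com/some/place/in/space.html';
echo prependBase($page, '../img.jpg'), "\n";
// prints http://www.mysite.com/some/place/in/../img.jpg
echo prependBase($page, '/'), "\n";
// prints http://www.mysite.com/
```

Servers generally accept the "../" form, but if the URLs need to be compared or de-duplicated, the dot segments would still have to be collapsed afterwards.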

Upvotes: 0
