Reputation: 45
I am writing a small crawler that extracts links from some 5 to 10 sites. While collecting the links, I get some URLs like this:
../tets/index.html
If it were /test/index.html, I could simply prepend the base URL to get http://www.example.com/test/index.html.
What can I do for this kind of URL?
Upvotes: 2
Views: 307
Reputation: 46692
Use dirname() to get the base directory, remove the ".." using substr(), and append the rest to it. Like this:
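<?php
$url = "../tets/index.html";
// Treated here as the directory the link was found in (no trailing slash)
$currentURL = "http://example.com/somedir/anotherdir";
// dirname() moves up one directory; substr($url, 2) strips the leading ".."
echo dirname($currentURL).substr($url, 2);
?>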
This outputs:
http://example.com/somedir/tets/index.html
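The snippet above handles exactly one leading "../". As a rough sketch of the same dirname() idea extended to several leading "../" parts: the function name resolveParentRefs is made up for illustration, and it assumes the base is a directory URL without a trailing slash and that the link never climbs above the site root (dirname() would then start eating into the host name).

```php
<?php
// Hypothetical helper, not part of PHP; named here only for illustration.
// Assumes $baseDir is the URL of the directory the link was found in,
// without a trailing slash, and that $relative only has leading "../" parts.
function resolveParentRefs(string $baseDir, string $relative): string
{
    // Each leading "../" moves one directory up from the base.
    while (strncmp($relative, '../', 3) === 0) {
        $baseDir = dirname($baseDir);
        $relative = substr($relative, 3);
    }
    return $baseDir . '/' . $relative;
}

echo resolveParentRefs('http://example.com/somedir/anotherdir', '../../tets/index.html'), "\n";
```

With a single "../" this gives the same result as the one-liner above; it does not handle "./", "../" in the middle of a path, or absolute links.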
Upvotes: 0
Reputation: 23255
URLs like these are relative URLs. ".." means "parent directory", whereas "." simply means "this directory", as in bash. For instance, if you are looking at the page http://www.someserver/test/foo/bar.html and it contains a URL like "../baz/foobar.html", that URL will in fact point to http://www.someserver/test/baz/foobar.html, I think. Just test it.
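To make the ".." and "." rule concrete, here is a minimal sketch of resolving a link against the page it was found on. The function name resolveRelativeUrl is hypothetical, and the sketch assumes plain http(s) page URLs with a scheme and host; it ignores ports, query strings, fragments, and links that are already absolute, and it drops any trailing slash.

```php
<?php
// Hypothetical helper, named for illustration only.
// Assumes $pageUrl has a scheme and host; ports, queries and fragments are ignored.
function resolveRelativeUrl(string $pageUrl, string $relative): string
{
    $parts = parse_url($pageUrl);
    $root = $parts['scheme'] . '://' . $parts['host'];
    $path = $parts['path'] ?? '/';

    if (strncmp($relative, '/', 1) === 0) {
        // Root-relative link: the page's own path does not matter.
        $combined = $relative;
    } else {
        // Drop the page's file name, keep its directory, then append the link.
        $combined = substr($path, 0, strrpos($path, '/') + 1) . $relative;
    }

    $resolved = [];
    foreach (explode('/', $combined) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);      // ".." goes up one directory
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;    // "." and empty segments are skipped
        }
    }
    return $root . '/' . implode('/', $resolved);
}

echo resolveRelativeUrl('http://www.someserver/test/foo/bar.html', '../baz/foobar.html'), "\n";
```

For the example above this prints http://www.someserver/test/baz/foobar.html, and a root-relative link such as /test/index.html is simply attached to the scheme and host.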
Upvotes: 1