Reputation: 143
I am trying to write Python code to extract links from a web page. My logic is to look
for the sequence <a href="">.
The code extracts the link address from a normal anchor tag such as
<a href="https://www.google.com">
but I see that there are other ways of specifying hyperlinks,
such as:
<a href="/news/">News</a>
<a href="/docs/">Documentation</a>
<a href="/downloads/">Downloads</a>
<a href="/support/">Support</a>
Clicking '/news/' takes me to "https://www.reviewboard.org/news/".
How does this happen, and where is this information stored?
'/news/' is useless by itself unless it is converted to the complete string
https://www.reviewboard.org/news/.
Thanks
Upvotes: 0
Views: 201
Reputation: 1824
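These are relative links. The full address is not stored anywhere; the browser resolves each link against the URL of the page where the link is found.
So if I am on www.somewebsite.com/somepage/ and I encounter this link:
<a href="/someotherpage/">Some other page</a>
it will take me to www.somewebsite.com/someotherpage/, because a leading slash means the path starts from the root of the site. Without the leading slash (href="someotherpage/"), it would take me to www.somewebsite.com/somepage/someotherpage/.
These work the same way relative file paths work, including the ../
syntax to point back up through the directory structure.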
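If it helps, here is a minimal sketch of doing the same resolution in Python using only the standard library. urllib.parse.urljoin combines the URL of the page you downloaded with whatever is in the href. The LinkCollector class, the extract_links function, and the /docs/manual/ page URL are names I made up for illustration; only the reviewboard.org links come from your question. It also assumes the page does not use a <base> tag to override the base URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects absolute URLs from <a href="..."> tags."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url  # URL the HTML was fetched from
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve the href against the page URL, as a browser would.
                self.links.append(urljoin(self.page_url, value))


def extract_links(html, page_url):
    """Return every link in `html` as an absolute URL."""
    parser = LinkCollector(page_url)
    parser.feed(html)
    return parser.links


# Assumed page URL, chosen only to show the difference between link styles.
page_url = "https://www.reviewboard.org/docs/manual/"
html = """
<a href="https://www.google.com">Absolute</a>
<a href="/news/">Root-relative</a>
<a href="faq/">Relative to the current directory</a>
<a href="../releases/">One level up</a>
"""

for link in extract_links(html, page_url):
    print(link)
```

An href that is already absolute passes through urljoin unchanged, so you can run every link through the same call without checking which kind it is.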
Upvotes: 1