Edmond

Reputation: 69

Python unquote URL but retain hyperlink

I have a URL that contains Cyrillic characters. When I paste it into Python, I get its percent-encoded form:

link = "https://dspace.udpu.edu.ua/bitstream/6789/1751/1/%D0%97%D0%B0%D0%B3%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%BD%D0%B0%D1%83%D0%BA%D0%BE%D0%B2%D1%96%20%D0%BC%D0%B5%D1%82%D0%BE%D0%B4%D0%B8%20%D0%B4%D0%BE%D1%81%D0%BB%D1%96%D0%B4%D0%B6%D0%B5%D0%BD%D0%BD%D1%8F.pdf"

To convert it into a hyperlink with Cyrillic characters, I have tried using urllib.parse.unquote:

from urllib.parse import unquote
unquoted_link = unquote(link)

The result for unquoted_link is:

https://dspace.udpu.edu.ua/bitstream/6789/1751/1/Загальнонаукові методи дослідження.pdf

That is, only the part before the first space is treated as a hyperlink, and the rest is rendered as plain text. How can I make the whole string a hyperlink, just as Stack Overflow does with the first link I provided?

Upvotes: 3

Views: 215

Answers (1)

Christopher Peisert

Reputation: 24104

The unquoted URL is still usable as a hyperlink. For example, when it is placed within quotes in the href attribute of an HTML link, it works as expected. The issue is that some contexts, such as various terminals, treat whitespace as the end of the URL.

<a href="https://dspace.udpu.edu.ua/bitstream/6789/1751/1/Загальнонаукові методи дослідження.pdf">
    https://dspace.udpu.edu.ua/bitstream/6789/1751/1/Загальнонаукові методи дослідження.pdf</a>
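
If the link has to survive in a context that stops at whitespace, one possible workaround is to decode the Cyrillic but keep the spaces percent-encoded. The sketch below is only an illustration under that assumption: the helper names readable_link and html_anchor are hypothetical and not part of urllib, and the simple space replacement does not handle other characters (such as an encoded # or ?) that would change meaning once decoded.

```python
from html import escape
from urllib.parse import unquote

link = (
    "https://dspace.udpu.edu.ua/bitstream/6789/1751/1/"
    "%D0%97%D0%B0%D0%B3%D0%B0%D0%BB%D1%8C%D0%BD%D0%BE%D0%BD%D0%B0%D1%83"
    "%D0%BA%D0%BE%D0%B2%D1%96%20%D0%BC%D0%B5%D1%82%D0%BE%D0%B4%D0%B8%20"
    "%D0%B4%D0%BE%D1%81%D0%BB%D1%96%D0%B4%D0%B6%D0%B5%D0%BD%D0%BD%D1%8F.pdf"
)


def readable_link(encoded_url: str) -> str:
    """Hypothetical helper: decode percent-escapes, then re-encode spaces.

    The result shows Cyrillic characters but contains no literal spaces,
    so whitespace-delimited link detection should not cut it short.
    Caveat: any other escaped reserved character is decoded too.
    """
    return unquote(encoded_url).replace(" ", "%20")


def html_anchor(encoded_url: str) -> str:
    """Hypothetical helper: encoded URL as href, decoded URL as link text."""
    href = escape(encoded_url, quote=True)   # safe inside a quoted attribute
    text = escape(unquote(encoded_url))      # human-readable label
    return f'<a href="{href}">{text}</a>'


if __name__ == "__main__":
    print(readable_link(link))
    print(html_anchor(link))
```

The first helper suits plain-text contexts such as terminals or chat messages. The second follows the HTML approach above, where the quoted href attribute makes the spaces in the visible text harmless.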

Upvotes: 2
