Reputation: 149
I have these two urls:
absolute_url = 'https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports'
relative_url = 'en/relacje-inwestorskie/reports/current-reports/2018/242018/'
And I'd like to join them to create this:
https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports/2018/242018/
However, urljoin doesn't join the urls together correctly:
from urllib.parse import urljoin
urljoin(absolute_url, relative_url)
>> https://ciechgroup.com/en/relacje-inwestorskie/reports/en/relacje-inwestorskie/reports/current-reports/2018/242018/
Do you know how I can achieve this without duplicating part of the url?
Upvotes: 1
Views: 1825
Reputation: 1364
urljoin is doing what it's supposed to do. Because your absolute URL has no trailing slash, its last segment (current-reports) is treated like a file name and dropped, so the "current path" (/en/relacje-inwestorskie/reports/) becomes the base that your relative URL is resolved against. The result is indeed /en/relacje-inwestorskie/reports/en/relacje-inwestorskie/reports/current-reports/2018/242018/.
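To make the resolution rule concrete, here is a minimal sketch using made-up example.com URLs (these are illustrative placeholders, not the URLs from the question). It compares a base with and without a trailing slash, and a reference with a leading slash:

```python
from urllib.parse import urljoin

# Hypothetical URLs, used only for illustration.
base_no_slash = "https://example.com/a/b/c"
base_slash = "https://example.com/a/b/c/"

# No trailing slash on the base: the last segment "c" is dropped
# before the relative reference is appended.
no_slash_result = urljoin(base_no_slash, "d/e/")
print(no_slash_result)  # https://example.com/a/b/d/e/

# Trailing slash on the base: "c/" is kept as the current directory.
slash_result = urljoin(base_slash, "d/e/")
print(slash_result)  # https://example.com/a/b/c/d/e/

# Leading slash on the reference: the base path is replaced entirely.
root_result = urljoin(base_no_slash, "/d/e/")
print(root_result)  # https://example.com/d/e/
```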
From your expected result, it seems that your relative_url is actually an absolute path (relative to the site root), so you need to prepend / to it:
>>> absolute_url = 'https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports'
>>> relative_url = '/en/relacje-inwestorskie/reports/current-reports/2018/242018/'
>>> from urllib.parse import urljoin
>>> urljoin(absolute_url, relative_url)
'https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports/2018/242018/'
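If the paths come from somewhere you don't control (scraped links, for instance) and you know they are always meant to be resolved from the site root, the prepending can be wrapped in a small helper. This is only a sketch: join_from_root is a hypothetical name, and it assumes every path it receives is root-relative.

```python
from urllib.parse import urljoin


def join_from_root(base_url: str, path: str) -> str:
    """Resolve `path` against the root of `base_url`'s site.

    Hypothetical helper: assumes `path` is always meant to be
    root-relative, whether or not it starts with a slash.
    """
    # Normalise to exactly one leading slash so urljoin replaces
    # the whole path of base_url instead of appending to it.
    return urljoin(base_url, "/" + path.lstrip("/"))


absolute_url = "https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports"
relative_url = "en/relacje-inwestorskie/reports/current-reports/2018/242018/"

joined = join_from_root(absolute_url, relative_url)
print(joined)
```

With the URLs from the question this prints the expected https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports/2018/242018/.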
Upvotes: 1
Reputation: 1083
Prepend a / to your relative_url:
>>> from urllib.parse import urljoin
>>> absolute_url = 'https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports'
>>> relative_url = '/en/relacje-inwestorskie/reports/current-reports/2018/242018/'
>>> urljoin(absolute_url, relative_url)
'https://ciechgroup.com/en/relacje-inwestorskie/reports/current-reports/2018/242018/'
Upvotes: 3