Reputation: 9
I want to remove the domain from a URL. For example, the user enters www.google.com, but I only need www.google.
How can I do this in Python? Thanks
Upvotes: 0
Views: 3422
Reputation: 3452
If you just want to remove the last 4 characters, slice the string:
url = 'www.google.com'
cut_url = url[:-4]
# output : 'www.google'
More advanced answer
If you have a list of all the possible domains (domains):
domains = ['com', 'uk', 'fr', 'net', 'co', 'nz']  # and so on...
while True:
    domain = url.split('.')[-1]
    if domain in domains:
        url = '.'.join(url.split('.')[:-1])
    else:
        break
Or if, for example, you have a domains list where .co and .uk are not separated:
domains = ['.com', '.co.uk', '.fr', '.net', '.co.nz']  # and so on...
for domain in domains:
    if url.endswith(domain):
        cut_url = url[:-len(domain)]
        break
else:  # there is no indentation mistake here.
    # else after for will be executed if for did not break
    print('no known domain found')
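One caveat with the endswith loop is that the result depends on the order of the list: if a list contained both '.uk' and '.co.uk' with '.uk' first, only '.uk' would be removed. A minimal sketch of one way around that, trying the longest suffixes first. The helper name strip_known_suffix and the sample list KNOWN_SUFFIXES are hypothetical and not part of the answer above:

```python
# Hypothetical sample list; a real application would need a fuller one.
KNOWN_SUFFIXES = ['.com', '.uk', '.co.uk', '.fr', '.net', '.co.nz']


def strip_known_suffix(url, suffixes=KNOWN_SUFFIXES):
    """Return url without its longest known suffix, or unchanged if none matches."""
    # Longest first, so '.co.uk' is tried before '.uk'.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if url.endswith(suffix):
            return url[:-len(suffix)]
    return url


print(strip_known_suffix('www.google.co.uk'))  # www.google
print(strip_known_suffix('www.google.com'))    # www.google
print(strip_known_suffix('localhost'))         # localhost
```

Returning the input unchanged when nothing matches is a design choice for this sketch; printing a message, as the loop above does, works just as well.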
Upvotes: 0
Reputation: 13
What you need here is the rstrip function.
Try this code:
url = 'www.google.com'
url2 = 'www.google'
new_url = url.rstrip('.com')
print(new_url)
new_url2 = url2.rstrip('.com')
print(new_url2)
rstrip removes trailing characters that appear in its argument. The argument is treated as a set of characters (here '.', 'c', 'o' and 'm'), not as the literal string ".com". If the string does not end with any of those characters, it is left unchanged. lstrip does the same at the left end of the string. Check the docs.
Also check the strip and lstrip functions.
As @SteveJessop pointed out, the above example is NOT the right solution, so I'm submitting another one. It is similar to another answer here, but it first checks whether the string ends with '.com'.
url = 'www.foo.com'
if url.endswith('.com'):
    url = url[:-4]
print(url)
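To see why the rstrip version is not reliable, here is a small comparison on the same 'www.foo.com' input. It is only an illustration; the variable names stripped and checked are made up for this example:

```python
url = 'www.foo.com'

# rstrip removes any trailing run of the characters '.', 'c', 'o', 'm',
# so it also eats the "oo" of "foo".
stripped = url.rstrip('.com')
print(stripped)  # www.f

# Checking the suffix explicitly removes exactly ".com" and nothing else.
suffix = '.com'
checked = url[:-len(suffix)] if url.endswith(suffix) else url
print(checked)  # www.foo
```

On Python 3.9 and later, str.removesuffix performs the same check-and-slice in a single call.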
Upvotes: -1
Reputation: 639
To solve this without having to deal with a list of domain names, you can scan for dots from the left-hand side and stop at the second dot.
t = 'www.google.com'
a = t.split('.')[1]  # the part between the first and second dot
pos = t.find(a)
t = t[:pos+len(a)]
# t is now 'www.google'
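The same idea, keeping only the first two dot-separated parts, can be sketched more directly with split and join. The helper name keep_first_two_labels is hypothetical. The sketch assumes the input always starts with a prefix such as www, and it avoids the IndexError that t.split('.')[1] raises when the string contains no dot:

```python
def keep_first_two_labels(host):
    """Keep everything up to, but not including, the second dot."""
    parts = host.split('.')
    # Slicing never raises, even when there are fewer than two parts.
    return '.'.join(parts[:2])


print(keep_first_two_labels('www.google.com'))    # www.google
print(keep_first_two_labels('www.google.co.uk'))  # www.google
print(keep_first_two_labels('localhost'))         # localhost
```

Note that an input without the www prefix, such as 'google.com', comes back unchanged, because it contains only one dot.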
Upvotes: 2
Reputation: 37033
This is a very general question. But the narrowest answer would be as follows (assuming url holds the URL in question):
if url.endswith(".com"):
    url = url[:-4]
If you want to remove the last period and everything to the right of it, the code would be a little more complicated:
pos = url.rfind('.')  # find rightmost dot
if pos >= 0:  # found one
    url = url[:pos]
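As an alternative sketch of the same "drop the last dot and everything after it" step, str.rpartition splits on the rightmost dot in one call. The helper name drop_last_label is hypothetical:

```python
def drop_last_label(url):
    """Remove the rightmost dot and everything after it, if there is a dot."""
    head, dot, _tail = url.rpartition('.')
    # When no dot is found, rpartition returns ('', '', url), so dot is empty.
    return head if dot else url


print(drop_last_label('www.google.com'))    # www.google
print(drop_last_label('www.google.co.uk'))  # www.google.co
print(drop_last_label('localhost'))         # localhost
```

As the second call shows, this removes only one label, so a two-part ending like .co.uk needs either a second pass or a list of known suffixes.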
Upvotes: 3