Reputation: 61
I want to remove parts of a string if they appear in a list.
For example, given this list:
foo = ['.com', '.net', '.co', '.in']
I want to convert these strings:
google.com
google.co.in
google.net
google.com/gmail/
into these strings:
google
google
google
google/gmail/
So far I have found this solution. Is there a more efficient way to do it?
replace item in a string if it matches an item in the list
Upvotes: 0
Views: 2801
Reputation: 136
Another alternative is to use str.replace() and str.find().
foo = ['.com', '.net', '.co', '.in']
domains = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

def remove_extensions(domain, extensions):
    for ext in extensions:
        if domain.find(ext) != -1:
            domain = domain.replace(ext, "")
    return domain

list(map(lambda x: remove_extensions(x, foo), domains))
This code snippet outputs the result as expected:
['google', 'google', 'google', 'google/gmail/']
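As a side note, str.replace() returns the string unchanged when the substring is absent, so the str.find() check is optional. A minimal sketch of the same idea without it (the name strip_parts is only illustrative and does not come from the answer above):

```python
foo = ['.com', '.net', '.co', '.in']
domains = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

def strip_parts(domain, parts):
    # str.replace() is a no-op when `part` does not occur in `domain`,
    # so no membership test is needed first.
    for part in parts:
        domain = domain.replace(part, "")
    return domain

cleaned = [strip_parts(d, foo) for d in domains]
print(cleaned)
```

Note that the order of the list matters here too: if '.co' came before '.com', "google.com" would turn into "googlem".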
Upvotes: 0
Reputation: 71451
You can use re.sub and str.join:
import re
foo = ['.com', '.net', '.co', '.in']
urls = ["google.com","google.co.in","google.net","google.com/gmail/"]
# re.escape makes each '.' match a literal dot instead of any character
final_result = [re.sub('|'.join(map(re.escape, foo)), '', i) for i in urls]
Output:
['google', 'google', 'google', 'google/gmail/']
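One thing to watch with a joined pattern is that alternation picks the first alternative that matches, so a list where '.co' comes before '.com' would leave a stray "m" behind. A hedged sketch that escapes each entry, sorts the longest entries first, and compiles the pattern once (the names ordered, pattern and cleaned are illustrative, not from the answer above):

```python
import re

foo = ['.com', '.net', '.co', '.in']
urls = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

# Longest entries first, so '.com' is always tried before '.co'
# regardless of the order in which the list was written.
ordered = sorted(foo, key=len, reverse=True)

# re.escape turns '.' into a literal dot rather than "any character".
pattern = re.compile('|'.join(re.escape(s) for s in ordered))

cleaned = [pattern.sub('', u) for u in urls]
print(cleaned)

# What the ordering guards against: '.co' matched first leaves the 'm'.
naive = re.sub('|'.join(map(re.escape, ['.co', '.com'])), '', 'google.com')
print(naive)
```

This still removes the listed pieces anywhere in the string, not only at the end of the host name, which matches what the question asks for.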
Upvotes: 1
Reputation: 2937
Similar to George Shuklin's answer.
import re

suffixes = ['.com', '.co', '.in', '.net']
# re.escape so that '.' matches a literal dot, not any character
patterns = [re.compile(re.escape(suffix)) for suffix in suffixes]

def remove_suffixes(s: str) -> str:
    for pattern in patterns:
        s = pattern.sub("", s)
    return s

# urls = ["google.com", ...
clean_urls = map(remove_suffixes, urls)
# or clean_urls = [remove_suffixes(url) for url in urls]
You might want to use the list comprehension, because it can be faster than map in many cases.
This has the advantage of also compiling the regexes, which can be better for performance when used in a loop.
Or, if you decide to use functools.reduce:
from functools import reduce

def remove_suffixes(s: str) -> str:
    return reduce(lambda s, pattern: pattern.sub("", s), patterns, s)
Upvotes: 1
Reputation: 715
Using George Shuklin's suggestion, this is the simplest code I could come up with.
import re

domains = ['.com', '.net', '.co', '.in']
urls = ["google.com","google.co.in","google.net","google.com/gmail/"]

for i in range(len(urls)):
    for domain in domains:
        # re.escape so that '.' matches a literal dot, not any character
        urls[i] = re.sub(re.escape(domain), "", urls[i])
print(urls)
This outputs:
['google', 'google', 'google', 'google/gmail/']
Upvotes: 0
Reputation: 7877
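You need to split this task in two: replacing a substring within a single string, and applying that replacement to every string in the list.
The first can be done with a regexp (see below). The second can be done with the map function.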
Example of the code to replace substring:
>>> import re
>>> re.sub(r"\.com", "", "google.com/gmail/")
'google/gmail/'
Example of using the map function:
>>> list(map(lambda x: len(x), ["one", "two", "three"]))
[3, 3, 5]
(It produces a new list holding the length of each element. In Python 3, map returns a lazy iterator, hence the list() call.)
You can combine those two to get what you want.
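One way the two pieces might be combined, as a minimal sketch: the helper name strip_parts is hypothetical, and re.escape is added so that each dot is treated literally.

```python
import re

foo = ['.com', '.net', '.co', '.in']
urls = ["google.com", "google.co.in", "google.net", "google.com/gmail/"]

def strip_parts(url):
    # Step 1: remove each listed substring from a single string.
    for part in foo:
        url = re.sub(re.escape(part), "", url)
    return url

# Step 2: apply the replacement to every string in the list.
result = list(map(strip_parts, urls))
print(result)
```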
Upvotes: 0