Replace a particular substring in a list of strings with another substring

Question

Here is an example scenario for my question. I have a list of URLs:

url_list=["https:www.example.com/pag31/go","https:www.example.com/pag12/go","https:www.example.com/pag0/go"]

I want to replace the substring between ".com/" and "/go" in every URL.

For example, the new list should look like this:

['https:www.example.com/home/go','https:www.example.com/home/go','https:www.example.com/home/go']

I have tried slicing and replacing based on index but couldn't get the required result for the whole list.

Any help is really appreciated. Thanks in advance.

PacketLoss · Accepted Answer

You can use re.sub() and a list comprehension to apply the replacement to every element of your list.

import re

url_list=["https:www.google.com/pag31/go","https:www.facebook.com/pag12/go","http:www.bing.com/pag0/go"]

pattern = r'(?<=com\/).*(?=\/go)'

result = [re.sub(pattern, 'home', url) for url in url_list]

This matches whatever sits between com/ and /go in each string. Because the pattern looks for com/ rather than the scheme, it works for any website whose URL contains com/, regardless of http or https.

Output:

['https:www.google.com/home/go', 'https:www.facebook.com/home/go', 'http:www.bing.com/home/go']
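
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name replace_segment is made up for illustration, and it assumes every URL contains com/ followed later by /go. The backslashes before the slashes are dropped because Python does not require them.

```python
import re

# Same pattern as in the answer; "/" needs no escaping in Python regexes.
PATTERN = re.compile(r'(?<=com/).*(?=/go)')

def replace_segment(urls, replacement):
    """Return a new list with the text between 'com/' and '/go' replaced."""
    # Hypothetical helper, not part of the original answer.
    return [PATTERN.sub(replacement, url) for url in urls]

url_list = [
    "https:www.google.com/pag31/go",
    "https:www.facebook.com/pag12/go",
    "http:www.bing.com/pag0/go",
]

result = replace_segment(url_list, "home")
print(result)
```

The original url_list is left untouched, since the list comprehension builds a new list.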

Regex Explanation

The pattern r'(?<=com\/).*(?=\/go)' looks for the following:

(?<=com\/): Positive lookbehind that asserts com/ comes immediately before the match, without being part of it

.*: Greedily matches any character (except a newline) zero or more times

(?=\/go): Positive lookahead that asserts /go comes immediately after the text matched by .*, without being part of it

This enables us to match any string between the positive checks. You can find a more in-depth explanation on the pattern here
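
One thing to keep in mind is that .* is greedy, so it extends to the last /go in the string. The sketch below uses a made-up URL containing /go twice to compare the greedy pattern with two narrower alternatives (.*? and [^/]*). Neither alternative comes from the answer above; they are shown only as options if your real URLs can look like this.

```python
import re

# Hypothetical URL that contains "/go" twice, to show how far each pattern reaches.
url = "https:www.example.com/a/go/b/go"

# .* is greedy: it runs up to the LAST "/go".
greedy = re.sub(r'(?<=com/).*(?=/go)', 'home', url)

# .*? is lazy: it stops at the FIRST "/go".
lazy = re.sub(r'(?<=com/).*?(?=/go)', 'home', url)

# [^/]* cannot cross a "/", so it only covers one path segment.
segment = re.sub(r'(?<=com/)[^/]*(?=/go)', 'home', url)

print(greedy)   # https:www.example.com/home/go
print(lazy)     # https:www.example.com/home/go/b/go
print(segment)  # https:www.example.com/home/go/b/go
```

For the URLs in the question all three patterns give the same result, because /go appears only once in each.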
