Reputation: 89
I want to use a regular expression to detect and substitute some phrases. These phrases follow the same pattern but deviate at some points. All the phrases are in the same string.
For instance I have this string:
/this/is//an example of what I want /to///do
I want to match every word that is attached to slashes, including the slashes themselves, and replace each match with "".
To solve this, I used the following code:
import re
txt = "/this/is//an example of what i want /to///do"
re.search("/.*/", txt, re.VERBOSE)
pattern1 = r"/.*?/\w+"
a = re.sub(pattern1,"",txt)
The result is:
' example of what i want '
which is what I want, that is, to substitute the phrases within // with "". But when I run the same pattern on the following sentence
"/this/is//an example of what i want to /do"
I get
' example of what i want to /do'
How can I use one regex and remove all the phrases and //, irrespective of the number of // in a phrase?
Upvotes: 0
Views: 1660
Reputation: 163437
In your example code, you can omit the re.search call: it runs, but you are not doing anything with the result.
You can match one or more / followed by word characters:
/+\w+
Or, for a slightly broader match, match one or more / followed by any characters other than / or whitespace:
/+[^\s/]+
The pattern matches:
- /+ matches 1+ occurrences of /
- [^\s/]+ matches 1+ occurrences of any char except a whitespace char or /
import re

strings = [
    "/this/is//an example of what I want /to///do",
    "/this/is//an example of what i want to /do"
]

# One or more slashes followed by one or more chars that are neither whitespace nor a slash
pattern1 = r"/+[^\s/]+"

for txt in strings:
    a = re.sub(pattern1, "", txt)
    print(a)
Output
example of what I want
example of what i want to
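To illustrate why the second pattern is described as a bit broader, here is a minimal sketch comparing the two on a made-up input. The sample string is an assumption chosen for illustration (it is not from the question); its slash-led tokens contain - and . characters, which \w does not match.

```python
import re

# Hypothetical input, not from the question: tokens contain '-' and '.'
sample = "see /foo-bar/baz.txt now"

# \w+ stops at the first non-word char, so "-bar" and ".txt" are left behind
narrow = re.sub(r"/+\w+", "", sample)

# [^\s/]+ runs up to the next whitespace or slash, so the whole token is removed
broad = re.sub(r"/+[^\s/]+", "", sample)

print(repr(narrow))  # 'see -bar.txt now'
print(repr(broad))   # 'see  now'
```

If the phrases only ever contain letters, digits, and underscores, both patterns should behave the same; the difference only shows up with punctuation inside a phrase.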
Upvotes: 1
Reputation: 627020
You can use
/(?:[^/\s]*/)*\w+
See the regex demo. Details:
/ - a slash
(?:[^/\s]*/)* - zero or more repetitions of: zero or more chars other than a slash and whitespace, followed by a slash
\w+ - one or more word chars.
See the Python demo:
import re
rx = re.compile(r"/(?:[^/\s]*/)*\w+")
texts = ["/this/is//an example of what I want /to///do", "/this/is//an example of what i want to /do"]
for text in texts:
    print(rx.sub('', text).strip())
# => example of what I want
# => example of what i want to
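As a rough way to see what this pattern consumes, compared with a per-segment pattern such as /+[^\s/]+, the sketch below lists the matches with re.findall on the first example string. The variable names are only illustrative.

```python
import re

text = "/this/is//an example of what I want /to///do"

# Per-segment pattern: each run of slashes plus the chars after it is its own match
per_segment = re.findall(r"/+[^\s/]+", text)

# Whole-phrase pattern: a slash-delimited phrase is consumed in a single match
whole_phrase = re.findall(r"/(?:[^/\s]*/)*\w+", text)

print(per_segment)   # ['/this', '/is', '//an', '/to', '///do']
print(whole_phrase)  # ['/this/is//an', '/to///do']
```

For plain removal with re.sub, both give the same result on these inputs; matching the whole phrase at once may be more convenient if each phrase needs to be handled as a unit.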
Upvotes: 0