Reputation: 167
I have the following path stored as a Python string: 'C:\ABC\DEF\GHI\App\Module\feature\src'. I would like to extract the word Module, which is located between \App\ and \feature\ in the path name. Note that the file separators '\' in between should not be extracted; only the string Module should be.
I had a few ideas on how to do it:
1. Write a regex that matches only the string between \App\ and \feature\.
2. Write a regex that matches from \App\ onward --> App\\[A-Za-z0-9]*\\, and then split that matched string in order to find the Module.
I think the 1st solution is better, but unfortunately it goes beyond my RegEx knowledge and I am not sure how to do it.
I would much appreciate any help.
Thank you in advance!
Upvotes: 0
Views: 76
Reputation: 1574
The regex you want is:
(?<=\\App\\).*?(?=\\feature\\)
Explanation of the regex:
- (?<=behind)rest matches all instances of rest if there is behind immediately before it. It's called a positive lookbehind.
- rest(?=ahead) matches all instances of rest where there is ahead immediately after it. This is a positive lookahead.
- \ is a reserved character in regex patterns, so to use it as part of the pattern itself, we have to escape it; hence, \\.
- .* matches any character, zero or more times.
- ? specifies that the match is not greedy (so we are implicitly assuming here that \feature\ only shows up once after \App\).
- The pattern also assumes that there are no \ characters between \App\ and \feature\.
The full code would be something like:
import re
path = 'C:\\ABC\\DEF\\GHI\\App\\Module\\feature\\src'
start = '\\App\\'
end = '\\feature\\'
# re.escape doubles each backslash so that it is matched literally
pattern = f"(?<={re.escape(start)}).*?(?={re.escape(end)})"
print(pattern) # (?<=\\App\\).*?(?=\\feature\\)
print(re.search(pattern, path)[0]) # Module
A link on regex lookarounds that may be helpful: https://www.regular-expressions.info/lookaround.html
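To make the non-greedy point concrete, here is a small sketch comparing .*? with .* on a made-up path in which \feature\ appears twice after \App\. The path is an assumption for illustration only, not one from the question.

```python
import re

# Hypothetical path where \feature\ occurs twice after \App\
path = r'C:\App\Module\feature\src\feature\x'

# Lazy: stops at the first \feature\ that follows \App\
lazy = re.search(r'(?<=\\App\\).*?(?=\\feature\\)', path)[0]
# Greedy: runs up to the last \feature\ in the string
greedy = re.search(r'(?<=\\App\\).*(?=\\feature\\)', path)[0]

print(lazy)    # Module
print(greedy)  # Module\feature\src
```

With a single \feature\ in the path, both variants return the same result.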
Upvotes: 3
Reputation: 369
You are looking for groups. With some small modifications you can extract only the part between App and feature:
(?:App\\\\)([A-Za-z0-9]*)(?:\\\\feature)
The parentheses ( ) define a match group, which you can get with match.group(1). Using (?:foo) defines a non-capturing group, i.e. one that is not included in your result. Try the expression here: https://regex101.com/r/24mkLO/1
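In Python, the group approach might look like the sketch below. It assumes the path holds single backslashes and writes the pattern as a raw string, so \\ in the pattern matches one literal backslash; the variable names are illustrative.

```python
import re

path = r'C:\ABC\DEF\GHI\App\Module\feature\src'

# Group 1 captures the name between App\ and \feature
match = re.search(r'App\\([A-Za-z0-9]*)\\feature', path)
module = match.group(1) if match else None  # None when the pattern is absent

print(module)  # Module
```

Checking the result of re.search before calling group(1) avoids an AttributeError when the path does not contain the expected folders.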
Upvotes: 2
Reputation: 4062
We can do that with str.find, something like:
path = 'C:\\ABC\\DEF\\GHI\\App\\Module\\feature\\src'
start = '\\App\\'
end = '\\feature\\'
# slice from just after the first start marker up to the last end marker
print(path[path.find(start) + len(start):path.rfind(end)])
Output:
Module
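One caveat: str.find returns -1 when the substring is missing, and the slice then silently yields a wrong result. A more defensive sketch could look like this. The helper name between is hypothetical, and unlike the snippet above it looks for the first end marker after the start marker rather than the last one in the string.

```python
def between(text, start, end):
    """Return the text between start and end, or None if either is missing."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)  # search for end only after start
    if j == -1:
        return None
    return text[i:j]

path = 'C:\\ABC\\DEF\\GHI\\App\\Module\\feature\\src'
print(between(path, '\\App\\', '\\feature\\'))      # Module
print(between(path, '\\Missing\\', '\\feature\\'))  # None
```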
Upvotes: 2