Reputation: 2927
I am trying to exclude URLs that contain /com/de/cms/ from a match.
For example, this should match:
www.example.com/catname/all-from-category/?pageNumber=1
but not this:
example.com/com/de/cms/catname/all-from-category/?pageNumber=3
Regex:
^[^com\/de\/cms\/]+\/all-from-category\/\?pageNumber=\d(&hitsPerPage=\d)?
https://regex101.com/r/Mqpspq/1
How can I exclude URLs containing com/de/cms/ while still matching the other URL?
Upvotes: 0
Views: 853
Reputation: 4986
There are a couple of mistakes in your regex.
The first ^ matches the start of the string, or the start of a line if multiline mode is enabled.
The [^com\/de\/cms\/] part is a character class: it matches any single character except c, o, m, /, d, e, or s. Your intent, however, was to exclude the substring com/de/cms as a whole. That can be done with a negative lookbehind, like this: (?<!com\/de\/cms\/)
You're missing the catname part.
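To see the character-class behaviour described above, here is a minimal Python sketch. It assumes Python's re module rather than the regex101 flavour from the question, and it drops the backslashes before / because Python does not need them; the variable names are only illustrative.

```python
import re

# The character class from the question; "/" needs no escaping in Python.
char_class = re.compile(r"[^com/de/cms/]")

# Collect the characters the class refuses to match: these are simply the
# distinct characters listed inside the brackets, not the substring as a whole.
rejected = sorted(ch for ch in set("com/de/cms/") if char_class.fullmatch(ch) is None)
print(rejected)  # ['/', 'c', 'd', 'e', 'm', 'o', 's']

# With "+", the class consumes characters only until the first rejected one,
# so on the URL that should match it stops at the "e" of "example".
prefix = re.match(r"[^com/de/cms/]+", "www.example.com/catname/all-from-category/?pageNumber=1")
print(prefix.group())  # www.
```

Because the class stops after "www.", the rest of the original pattern never gets a chance to line up with /all-from-category/, which is why the question's regex fails even on the URL it was meant to match.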
A working regex would be:
(?<!com\/de\/cms)\/catname\/all-from-category\/\?pageNumber=\d
The regex above simply says the following:
Please match /catname/all-from-category/?pageNumber=SOME_DIGIT that is not preceded by com/de/cms.
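As a quick check of the lookbehind regex against the two URLs from the question, here is a small sketch, assuming Python's re module (which supports fixed-width lookbehind such as this one). The results dictionary is just a hypothetical helper for the demonstration.

```python
import re

# The working regex from this answer, written without escaping "/".
pattern = re.compile(r"(?<!com/de/cms)/catname/all-from-category/\?pageNumber=\d")

urls = [
    "www.example.com/catname/all-from-category/?pageNumber=1",         # should match
    "example.com/com/de/cms/catname/all-from-category/?pageNumber=3",  # should not match
]

# search() is used instead of match() because the pattern is not anchored
# to the start of the string.
results = {url: pattern.search(url) is not None for url in urls}

for url, matched in results.items():
    print(matched, url)
# True www.example.com/catname/all-from-category/?pageNumber=1
# False example.com/com/de/cms/catname/all-from-category/?pageNumber=3
```

Note that the pattern hardcodes catname, as the answer does; a different category name would need its own pattern or a more general expression.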
Upvotes: 1