Reputation: 888
Let's say that I have a given string in JavaScript, e.g., var s = "{{1}}SomeText{{2}}SomeText";
It may be very long (e.g., 25,000+ chars).
NOTE: I'm using "SomeText" here as a placeholder to refer to any number of characters of plain text. In other words, "SomeText" could be any plain text string which doesn't include {{1}} or {{2}}. So the above example could be var s = "{{1}}Hi there. This is a string with one { curly bracket{{2}}Oh, very nice to meet you. I also have one } curly bracket!";
And that would be perfectly valid.
The rules for it are simple:
It does not need to have any instances of {{2}}. However, if it does, then after that instance we cannot encounter another {{2}} unless we find a {{1}} first.
Valid examples:
"{{2}}SomeText"
"{{1}}SomeText{{2}}SomeText"
"{{1}}SomeText{{1}}SomeText{{2}}SomeText"
"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText"
"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText{{1}}SomeText"
"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText{{1}}SomeText{{2}}SomeText"
etc...
Invalid examples:
"{{2}}SomeText{{2}}SomeText"
"{{1}}SomeText{{2}}SomeText{{2}}SomeText"
"{{1}}SomeText{{2}}SomeText{{2}}SomeText{{1}}SomeText"
etc...
This seems like a relatively easy problem to solve - and indeed I could easily solve it without regular expressions, but I'm keen to learn how to do something like this with regular expressions. Unfortunately, I'm not even sure if "conditionals and lookaheads" is a correct description of the issue in this case.
NOTE: If a workable solution is presented that doesn't involve "conditionals and lookaheads" then I will edit the title.
Upvotes: 4
Views: 190
Reputation: 8042
You said you can have one instance of {2} first, right?
^(.(?!{2}))(.{2})?(?!{2})((.(?!{2})){1}(.(?!{2}))({2})?)$
Note if {2} is one letter replace all dots with [^{2}]
Upvotes: 0
Reputation: 148980
It's probably easier to invert the condition. Try to match any text that contains two consecutive instances of {{2}}, and if it doesn't match that, it's good.
Using this strategy, your pattern can be as simple as:
/{\{2}}([^{]*){\{2}}/
This will match a literal {{2}}, followed by zero or more characters other than {, followed by a literal {{2}}.
Notice that the second { needs to be escaped; otherwise, the regex engine will treat the {2} as a quantifier on the previous { (i.e. {{2} matches exactly two { characters).
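To see the quantifier issue in isolation, here is a small sketch. The variable names are only for illustration, and it assumes a regex without the u flag (with u, an unescaped lone { is a syntax error instead).

```javascript
// Unescaped: the first "{" is a literal and "{2}" is read as a quantifier on it,
// so this pattern looks for two "{" characters followed by a single "}".
const unescaped = /{{2}}/;

// Fully escaped: every brace is a literal character.
const escaped = /\{\{2\}\}/;

console.log(unescaped.test("{{2}}")); // false
console.log(unescaped.test("{{}"));   // true
console.log(escaped.test("{{2}}"));   // true
```

Escaping all four braces is more than the engine strictly requires, but it makes the intent easier to read.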
Just in case you need to allow characters like { between the two {{2}}, you can use a pattern like this:
/{\{2}}((?!{\{1}}).)*{\{2}}/
This will match a literal {{2}}, followed by zero or more of any character, so long as those characters do not create a sequence like {{1}}, followed by a literal {{2}}.
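Putting the inverted check together, a minimal sketch could look like the following. The names isValid, doubledTwo and simple are hypothetical, the braces are all escaped for readability, and [\s\S] is used in place of the dot on the assumption that the text may contain line breaks.

```javascript
// Matches a {{2}} that is followed by another {{2}} with no {{1}} in between.
const doubledTwo = /\{\{2\}\}(?:(?!\{\{1\}\})[\s\S])*?\{\{2\}\}/;

// The simpler pattern, which stops at any single "{" between the two markers.
const simple = /\{\{2\}\}[^{]*\{\{2\}\}/;

// A string is valid when the "bad" sequence cannot be found anywhere in it.
function isValid(s) {
  return !doubledTwo.test(s);
}

console.log(isValid("{{2}}SomeText"));                             // true
console.log(isValid("{{1}}SomeText{{1}}SomeText{{2}}SomeText"));   // true
console.log(isValid("{{2}}SomeText{{2}}SomeText"));                // false
console.log(isValid("{{1}}SomeText{{2}}SomeText{{2}}SomeText"));   // false

// A lone "{" between the markers hides the problem from the simpler pattern.
console.log(simple.test("{{2}}a { b{{2}}"));      // false
console.log(doubledTwo.test("{{2}}a { b{{2}}"));  // true
```

Neither regex uses the g flag, so test() keeps no state between calls and isValid can be reused on many strings.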
Upvotes: 4
Reputation: 843
(({{1}}SomeText)+({{2}}SomeText)?)*
Broken down:
({{1}}SomeText)+ - 1 to many {{1}} instances (greedy match)
({{2}}SomeText)? - followed by an optional {{2}} instance
Then the whole thing is wrapped in ()* such that the sequence can appear 0 to many times in a row.
No conditionals or lookaheads needed.
Upvotes: 0