Reputation: 311
I'm stuck on building this regex. I tried combining look-ahead and look-behind, but I couldn't use the capture group inside the look-behind. I need to extract a character from a string ONLY if it occurs exactly 4 times in a row.
If I have these strings:
3346AAAA44
3973BBBBBB44
9755BBBBBBAAAA44
The first one will match because it has 4 A's in a row. The second one will NOT match because it has 6 B's in a row. The third one will match because it still has 4 A's. What makes it even more frustrating is that it can be any character from A to Z occurring 4 times.
Positioning does not matter.
EDIT: My attempt at the regex, which doesn't work:
(([A-Z])\2\2\2)(?<!\2*)(?!\2*)
Upvotes: 3
Views: 1058
Reputation: 370729
If lookbehind is allowed, after capturing the character, add a negative lookbehind for \1. (the backreference followed by a dot, i.e. any character). If that lookbehind matches, the start of the match is preceded by the same character as the captured first character, so the run began earlier. Then backreference the group 3 times, and add a negative lookahead for \1:
`3346AAAA44
3973BBBBBB44
9755BBBBBBAAAA44`
.split('\n')
.forEach((str) => {
console.log(str.match(/([a-z])(?<!\1.)\1{3}(?!\1)/i));
});
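As a quick check of what the snippet above finds, here is a minimal sketch that wraps the same pattern in a helper and returns only the matched run. The name findExactRun is hypothetical and not part of the answer, and the sketch assumes a runtime with lookbehind support (for example a recent Node.js).

```javascript
// Same pattern as in the snippet above (case-insensitive).
const exactFour = /([a-z])(?<!\1.)\1{3}(?!\1)/i;

// Hypothetical helper (name assumed for illustration): returns the run of
// exactly four identical letters, or null when there is none.
function findExactRun(str) {
  const m = str.match(exactFour);
  return m ? m[0] : null;
}

console.log(findExactRun("3346AAAA44"));       // AAAA
console.log(findExactRun("3973BBBBBB44"));     // null
console.log(findExactRun("9755BBBBBBAAAA44")); // AAAA
```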
([a-z]) - Capture a character
(?<!\1.) - Negative lookbehind: check that the captured character is not immediately preceded by the same character
\1{3} - Match the same character that was captured 3 more times
(?!\1) - After the 4th match, make sure it's not followed by the same character
Upvotes: 3
Reputation: 163287
Another variant is to capture the first char in group 1.
Then assert that the 2 chars to the left of the current position (the captured char and the one before it) are not both the same as group 1, and match group 1 an additional 3 times, for a total of 4 identical chars.
Finally, assert that what is on the right is not group 1.
([A-Z])(?<!\1\1)\1{3}(?!\1)
([A-Z]) - Capture group 1, match a single char A-Z
(?<!\1\1) - Negative lookbehind, assert what is on the left is not 2 times group 1
\1{3} - Match 3 times group 1
(?!\1) - Assert what is on the right is not group 1
For example:
let pattern = /([A-Z])(?<!\1\1)\1{3}(?!\1)/g;
[
"3346AAAA44",
"3973BBBBBB44",
"9755BBBBBBAAAA44",
"AAAA",
"AAAAB",
"BAAAAB"
].forEach(s =>
console.log(s + " --> " + s.match(pattern))
);
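Because the pattern carries the g flag, it can also report where each run starts. The following is a minimal sketch using String.prototype.matchAll; the helper name findRuns and the sample string "AAAABBBBBCCCC" are assumptions made for illustration, not part of the answer.

```javascript
// Same pattern as above; matchAll requires the g flag.
const runOfFour = /([A-Z])(?<!\1\1)\1{3}(?!\1)/g;

// Hypothetical helper: list every run of exactly four identical
// uppercase letters together with its start index.
function findRuns(str) {
  return [...str.matchAll(runOfFour)].map((m) => ({ run: m[0], index: m.index }));
}

console.log(findRuns("AAAABBBBBCCCC"));
// [ { run: 'AAAA', index: 0 }, { run: 'CCCC', index: 9 } ]
console.log(findRuns("3973BBBBBB44")); // []
```

The five B's in the middle of the sample string are skipped, while the runs of four on either side are both reported.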
Upvotes: 0
Reputation: 1403
Another version without lookbehind. The captured sequence of 4 equal characters is available in group 2.
(?:^|(?:(?=(\w)(?!\1))).)(([A-Z])\3{3})(?:(?!\3)|$)
(?:^|(?:(?=(\w)(?!\1))).) - Ensure it's the beginning of the string. Otherwise, the 2nd char must be different from the 1st one; if it is, skip the 1st char.
(([A-Z])\3{3}) - Capture 4 repeated [A-Z] chars
(?:(?!\3)|$) - Ensure the first char after those 4 is different, or that it's the end of the string
As suggested by bobble-bubble in a comment, the expression above can be simplified to:
(?:^|(\w)(?!\1))(([A-Z])\3{3})(?!\3)
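Since this pattern consumes the character before the run, the full match is one character longer than the run whenever the run does not start the string, which is why group 2 is the value to read. Here is a minimal sketch with the simplified expression; extractRun is a hypothetical name used only for illustration.

```javascript
// Simplified no-lookbehind pattern from above.
const noLookbehind = /(?:^|(\w)(?!\1))(([A-Z])\3{3})(?!\3)/;

// Hypothetical helper: return group 2 (the four-character run) or null.
function extractRun(str) {
  const m = str.match(noLookbehind);
  return m ? m[2] : null;
}

const sample = "3346AAAA44".match(noLookbehind);
console.log(sample[0]); // 6AAAA (includes the preceding character)
console.log(sample[2]); // AAAA
console.log(extractRun("3973BBBBBB44"));     // null
console.log(extractRun("9755BBBBBBAAAA44")); // AAAA
console.log(extractRun("AAAA"));             // AAAA
```

Note that the preceding character has to match \w, so a run that follows a non-word character (for example "-AAAA") is not found by this expression.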
Upvotes: 2