Reputation: 301
I am trying to build a regex that should match for example.
b
abab
babab
but not
bb
babb
aaaba
abaaba
At the moment I have a(b)|b(a)
and it is working for abab
. I am missing the first and last letter, for example b
or babab
.
So I need to specify a
alone or b
alone or a letter at the end of a word (if the letter before it isn't itself). But I can't figure out how to do that.
I am using http://www.rexv.org/ (Perl PCRE) to try it.
Thanks guys but i forgot to mention: An empty string can also be matched, and i can only use the following
* ? +
|
()
.
Thanks guys!,
i suppose it isn't possible without being able to specify beginning and end of the string to work correctly at http://www.rexv.org/
Upvotes: 2
Views: 181
Reputation: 213281
Edit: -
If you don't want to use look-ahead
and look-behind
assertion, you can use this regex: -
"b?(ab)*|a?(ba)*" // Will also match `empty string`
Explanation : -
b? // 0 or 1 b
( // capture group.
ab // Match ab
)* // group close `0 or more repetition
|
a?(ba)* // Same with `a` replaced with `b`, and `b` with `a`
Old Answer : -
Use this regex: -
"((?<!a)a|(?<!b)b)*" // This will also match empty string
It matches a
not preceded by another a
. Same for b
.
( // Capture group
(?<! // Negative Look-behind assertion
a // on a
)
a // Match a
| // or
(?<! // Negative Look-behind assertion
b // on b
)
b // Match b
) // Close capture group
+ // 1 or more repetition
Upvotes: 0
Reputation: 34367
Instead of building complex match regex, use the simple regex to match the repeating chars and use the opposite as:
String stringToMatch = "babaab";
Pattern p1 = Pattern.compile("^[ab]+$");//match the a`s and b`s kind of string
Pattern p2 = Pattern.compile("([ab])\\1+");//match the repeating a`s and b`s
Matcher m1 = p1.matcher(stringToMatch);
Matcher m2 = p2.matcher(stringToMatch);
if (m1.find() && !m2.find()){//validates it has a's and b's but not repeating
//valid string
}
To match any word character, simply use: (\\w)\\1+
. This is the best part of it. Simple and extendable to cover more set of characters e.g. abcdabcd etc.
Upvotes: 0
Reputation: 16296
Try this one:
^((b?(ab)*a?)|(a?(ba)*b?))$
This assumes your alphabet is limited to {a, b}
.
Upvotes: 0
Reputation: 38446
Try something like this:
^((?:(?:ab)*a?)|(?:(?:ba)*b?))$
Explained:
^( # beginning of the string
(?:
(?:ab)* # matches any repeating `ab` group
a? # group can optionally end with an `a`
)
|
(?:
(?:ba)* # matches any repeating `ba` group
b? # group can optionally end with a `b`
)
)$ # end of the string
I include the subgroups as non-capturing with the leading (?:
using a full-capturing group around the entire regex. This will make sure to return you only the full-strings that match instead of the noise of each sub-group.
The caveat to this approach is that an "empty" string will also match.
UPDATE (limited set of characters)
Your limited-set of characters will still work with my pattern above, however, we'll need to drop the non-matching group portion (?:
). The regex will end up as:
(((ab)*a?)|((ba)*b?))
The caveat mentioned above is that it will also match an empty string, however, this appears to be what you need so we can add that to the bonus-list!
A slight issue with the set of characters you're allowed to use is that you aren't allowed to use the ^
and $
characters which indicate the start and end of a string, respectively. The problem with this is that, any sub-pattern that is matched (regardless of the regex you use) will flag the input as valid. I'm assuming that this is accounted for though.
Upvotes: 2