Reputation: 2725
I need a regex that matches expressions containing the string OKAY, then an optional hyphen, then zero or one word characters. After that, any non-word character is accepted, followed by anything. For expressions that match, OKAY is changed to OK if no word character follows, and to e.g. OA if the following letter is A. If the hyphen exists, it is dropped.
OKAY => OK
OKAY- => OK
OKAYA => OA
OKAY-A => OA
OKAYAB => OKAYAB (no-match)
OKAY-AB => OKAY-AB (no-match)
The examples may be followed by e.g. .CD without changing the results:
OKAY.CD => OK.CD
OKAY-.CD => OK.CD
OKAYA.CD => OA.CD
OKAY-A.CD => OA.CD
OKAYAB.CD => OKAYAB.CD (no-match)
OKAY-AB.CD => OKAY-AB.CD (no-match)
My problem implementing this was that, since both the hyphen and the word character are optional, I get "lazy" matches that also match the unwanted cases. For the sake of education, I would appreciate examples both with and without lookaheads (if possible).
Upvotes: 0
Views: 225
Reputation: 208405
Here is a regex that should work for you:
\bOKAY(?>-?)(\w)?([^\w\s]\S*)?(?!\S)
Since it isn't clear what language you are using, here is pseudocode for how you would do the replacement. Group 2 is optional, so it needs the same fallback as group 1.
"O" + (match.group(1) if match.group(1) else "K") + (match.group(2) if match.group(2) else "")
Here is a rubular: http://www.rubular.com/r/SE8MBkUUUo
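As a rough illustration, the replacement pseudocode could be made runnable in Python along the following lines. This is only a sketch: the names PATTERN and shorten_okay are made up for the example, and because Python's re module only accepts (?>...) from version 3.11 on, the atomic hyphen is emulated with a lookahead plus backreference, which is assumed here to be an acceptable stand-in. Named groups are used so the extra capture does not shift the numbering.

```python
import re

# Assumption: "(?>-?)" is replaced by the lookahead + backreference idiom so the
# sketch also runs on Python versions older than 3.11.
PATTERN = re.compile(
    r"\bOKAY(?=(?P<hyphen>-?))(?P=hyphen)"  # optional hyphen, never given back
    r"(?P<char>\w)?"                        # optional single word character
    r"(?P<tail>[^\w\s]\S*)?"                # optional non-word char plus rest of token
    r"(?!\S)"                               # token must end at whitespace or end of string
)


def shorten_okay(text: str) -> str:
    """Rewrite every matching OKAY token in text; leave all other text alone."""
    def repl(match: re.Match) -> str:
        # Optional groups that did not take part in the match are None.
        return "O" + (match.group("char") or "K") + (match.group("tail") or "")
    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    for sample in ("OKAY", "OKAY-A.CD", "OKAY-AB", "OKAY OKAY"):
        print(sample, "=>", shorten_okay(sample))
```

Using re.sub with a callback keeps the "K or the captured letter" decision in one place, which a plain replacement string cannot express.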
Edit: I made some changes to the above regex after the comments, but the description below does not reflect those changes. Here are the changes from the original regex:
^ became \b, so the match doesn't need to start at the beginning of the line.
\W became [^\w\s], which prevents OKAY OKAY from being one match.
.* became \S*, so the match will end at whitespace.
$ became (?!\S), which means "only match if we are at the end of the string or the next character is whitespace"; it could also be written as (?=\s|\z).
The really tricky part here is that a regex like ^OKAY-?(\w)?(\W.*)?$ looks like it would work, but it does not for a case like OKAY-AB: after backtracking, both the -? and the (\w)? end up matching nothing, and then (\W.*)? matches the remainder of the string, hyphen included.
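The backtracking problem can be reproduced with a few lines of Python. This is only a demonstration of the naive regex quoted above; the helper name naive_groups is made up for the example.

```python
import re

# The anchored first attempt, with an ordinary (backtracking) optional hyphen.
NAIVE = re.compile(r"^OKAY-?(\w)?(\W.*)?$")


def naive_groups(text):
    """Return (group 1, group 2) if the naive regex matches, else None."""
    match = NAIVE.match(text)
    return match.groups() if match else None


for sample in ("OKAY-A", "OKAYAB", "OKAY-AB"):
    print(sample, "->", naive_groups(sample))
# OKAY-A -> ('A', None)
# OKAYAB -> None
# OKAY-AB -> (None, '-AB')
```

The last line shows the unwanted match: the hyphen was handed over to (\W.*)? instead of being consumed by -?.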
What we need to do to fix this is make sure that -? will not backtrack. This would be simple if possessive quantifiers were supported by .NET; then we could just change it to -?+. Unfortunately they aren't supported, so we need to use atomic grouping instead. (?>-?) will optionally match a -, but will forget all backtracking information as soon as it exits the group. Note that the atomic group does not capture, so (\w)? is capture group 1.
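To see the effect of a non-backtracking hyphen, here is a small Python comparison. It is a sketch under one loud assumption: instead of (?>-?), which Python's re only accepts from 3.11 on, it uses the lookahead-plus-backreference idiom (?=(-?))\1. That idiom adds a capture group, so in this variant the word character is group 2 rather than group 1. The names ATOMIC, BACKTRACKING and matches are hypothetical.

```python
import re

# Emulated atomic group: the lookahead captures the optional hyphen once, and
# the backreference consumes it; the engine cannot retry the lookahead later.
ATOMIC = re.compile(r"^OKAY(?=(-?))\1(\w)?(\W.*)?$")
# Same regex with an ordinary optional hyphen, for comparison.
BACKTRACKING = re.compile(r"^OKAY-?(\w)?(\W.*)?$")


def matches(pattern, text):
    """True if the whole string is accepted by the given compiled pattern."""
    return pattern.match(text) is not None


for sample in ("OKAY-A", "OKAY-.CD", "OKAY-AB"):
    print(sample, matches(BACKTRACKING, sample), matches(ATOMIC, sample))
```

Only OKAY-AB should differ between the two columns: the backtracking version accepts it, the atomic one rejects it.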
Upvotes: 2
Reputation: 20242
To do this without lookaheads, you can use
^(OKAY)(((-\w?|\w)(\W.*)?)|[^-\w].*)?$
This matches the word "OKAY" and then an optional group containing either (a) a - followed by an optional word character, or a single word character, and then an optional tail consisting of a non-word character followed by anything, or (b) a character that is neither a - nor a word character, followed by anything. The ^ and $ match the start and end of the string respectively, so it will only match exactly the acceptable strings.
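One possible way to turn this regex into the requested replacement is sketched below in Python. The group bookkeeping is my reading of the pattern, not something stated in the answer: group 4 holds the hyphen and/or letter, group 5 the tail of the first alternative, and group 2 the tail when only the second alternative matched. The names STRICT and shorten_strict are made up.

```python
import re

STRICT = re.compile(r"^(OKAY)(((-\w?|\w)(\W.*)?)|[^-\w].*)?$")


def shorten_strict(text):
    """Apply the OKAY -> OK / O<letter> rule, or return text unchanged."""
    match = STRICT.match(text)
    if match is None:
        return text
    if match.group(3) is not None:
        # First alternative: group 4 is "-", "-X" or "X"; group 5 is the tail.
        char = match.group(4).lstrip("-")
        tail = match.group(5) or ""
    else:
        # Second alternative, or nothing after OKAY: group 2 is the tail.
        char = ""
        tail = match.group(2) or ""
    return "O" + (char or "K") + tail


for sample in ("OKAY-", "OKAY-A.CD", "OKAY.CD", "OKAY-AB"):
    print(sample, "=>", shorten_strict(sample))
```

Because the hyphen sits inside group 4, it has to be stripped in code; that is the price of avoiding both lookaheads and atomic groups.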
Lookaheads would barely make a difference. The only change would be to put a lookahead ((?=...)) around everything after the "OKAY" group.
To use this with .NET, the only change needed would be to escape all of the \ characters in the string.
Upvotes: 1
Reputation: 3125
I don't know .NET regex, but this is a start with preg-style matching:
OKAY-?(\w?)([^\w-]\w+)?\s*$
If $1 is empty, then the output is OK$2.
Otherwise, the output is O$1$2.
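For comparison, the $1/$2 rule above could look like this in Python. It is only a sketch with made-up names (START, shorten_preg). Two properties of the regex as written are worth keeping in mind: it is not anchored at the start, and the tail must be a single non-word character followed by word characters only.

```python
import re

START = re.compile(r"OKAY-?(\w?)([^\w-]\w+)?\s*$")


def shorten_preg(text):
    """Apply the rule: OK$2 when $1 is empty, otherwise O$1$2."""
    def repl(match):
        char = match.group(1)          # "" when no word character followed
        tail = match.group(2) or ""    # None when the optional tail is absent
        return ("O" + char + tail) if char else ("OK" + tail)
    return START.sub(repl, text)


for sample in ("OKAY", "OKAY-A", "OKAYA.CD", "OKAY-AB.CD"):
    print(sample, "=>", shorten_preg(sample))
```

Note that any trailing whitespace matched by \s* is dropped by this replacement, since it is not part of either capture group.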
Upvotes: 1