Reputation: 15
I need to write a regular expression for grep on Unix that finds lines in which a doubled character (the same character twice in a row, such as 'AA' or '??') appears an odd number of times.
For example:
unix AA unixAA unix helpme AA //**true**, because 'AA' occurs 3 times
??red blue pink yellow red pink //**true**, because '??' occurs once
unixA unixAA unix unixAA unix //**false**, because 'AA' occurs 2 times
??red blue?? pink?? yellow?? //**false**, because '??' occurs 4 times
Thanks for the help :)
Upvotes: 1
Views: 243
Reputation: 786001
This is a pretty complicated regex problem. You will need GNU grep with the -P (PCRE) option so that you can use lookaheads:
^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$
Using it in grep:
grep -P '^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$' file
unix AA unixAA unix helpme AA
??red blue pink yellow red pink
RegEx Breakup:
^ # Start
(?:(?!(.)\1).)* # Match 0+ characters, none of which starts a doubled character
((.)\3) # Match the first doubled character and capture the pair in group #2
((?:(?:(?!\2).)*\2){2})* # Match 0+ repetitions of two further occurrences of group #2, each preceded by text that doesn't contain it (keeps the total count odd)
(?:(?!\2).)* # Match any trailing text that doesn't contain group #2
$ # End
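To check the pattern against all four sample lines from the question, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P option); the scratch file and the variable names `sample`, `pattern` and `matches` are only illustrative.

```shell
# Hypothetical scratch file holding the four sample lines from the question.
sample=$(mktemp)
printf '%s\n' \
  'unix AA unixAA unix helpme AA' \
  '??red blue pink yellow red pink' \
  'unixA unixAA unix unixAA unix' \
  '??red blue?? pink?? yellow??' > "$sample"

# Same pattern as above; -P needs GNU grep with PCRE support (assumption).
pattern='^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$'
matches=$(grep -P "$pattern" "$sample")

# Print the lines whose first doubled character occurs an odd number of times.
printf '%s\n' "$matches"
rm -f "$sample"
```

Only the first two lines should be printed. Note that the pattern counts only the first doubled character found on the line, so a later, different pair (such as the 'll' in 'yellow') is ignored.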
Upvotes: 3