AsiaRafi

Reputation: 15

Odd number of double characters in a line (GREP)

I need to write a regular expression for grep on Unix that matches lines in which a doubled character (the same character twice in a row, such as AA or ??) occurs an odd number of times.

For example:

unix AA unixAA unix helpme AA //**true**, because 'AA' occurs 3 times

??red blue pink yellow red pink //**true**, because '??' occurs once

unixA unixAA unix unixAA  unix    //**false**, because 'AA' occurs 2 times

??red blue?? pink?? yellow??  //**false**, because '??' occurs 4 times

Thanks for the help :)

Upvotes: 1

Views: 243

Answers (1)

anubhava

Reputation: 786001

This is a fairly complicated regex problem. You will need GNU grep with PCRE support (the -P option) so that lookaheads are available:

^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$

Using in grep:

grep -P '^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$' file

unix AA unixAA unix helpme AA
??red blue pink yellow red pink
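
If a grep with -P is not available, the result can be cross-checked without a regex. The following is only a sketch of the same idea as I read it from the examples, not part of the grep solution: take the first doubled character in the line and count how often that pair occurs. The function names first_doubled and has_odd_doubles are made up for illustration.

```python
def first_doubled(line):
    """Return the first run of two identical adjacent characters, or None."""
    for a, b in zip(line, line[1:]):
        if a == b:
            return a + b
    return None


def has_odd_doubles(line):
    """True if the first doubled pair in the line occurs an odd number of times."""
    pair = first_doubled(line)
    if pair is None:
        return False
    # str.count counts non-overlapping occurrences, scanning left to right
    return line.count(pair) % 2 == 1


print(has_odd_doubles("unix AA unixAA unix helpme AA"))       # True  (AA occurs 3 times)
print(has_odd_doubles("??red blue pink yellow red pink"))     # True  (?? occurs once)
print(has_odd_doubles("unixA unixAA unix unixAA  unix    "))  # False (AA occurs 2 times)
print(has_odd_doubles("??red blue?? pink?? yellow??  "))      # False (?? occurs 4 times)
```
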

RegEx Breakup:

^                         # Start of line
(?:(?!(.)\1).)*           # Match 0+ characters, none of which begins a doubled character
((.)\3)                   # Match the first doubled character and capture the pair in group #2
((?:(?:(?!\2).)*\2){2})*  # Match 0+ further pairs of occurrences of group #2, each preceded by text that does not contain it
(?:(?!\2).)*              # Match any remaining text that does not contain group #2
$                         # End of line
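
To watch the groups at work without grep, the same pattern can be tried in Python, whose re module also supports lookaheads and backreferences. This is a minimal sketch; PATTERN, SAMPLES and first_pair are names chosen here for illustration only.

```python
import re

# The pattern from the answer, unchanged
PATTERN = re.compile(
    r'^(?:(?!(.)\1).)*((.)\3)((?:(?:(?!\2).)*\2){2})*(?:(?!\2).)*$'
)

# The four sample lines from the question
SAMPLES = [
    "unix AA unixAA unix helpme AA",
    "??red blue pink yellow red pink",
    "unixA unixAA unix unixAA  unix    ",
    "??red blue?? pink?? yellow??  ",
]


def first_pair(line):
    """Return the pair captured by group #2, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.group(2) if m else None


for line in SAMPLES:
    print(repr(line), "->", first_pair(line))
```

Only the first two samples match, with group #2 holding AA and ?? respectively. Note that the pattern counts only the first doubled character found in the line; a later double of a different character, such as the ll in yellow, is ignored.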

Upvotes: 3
