k0d3r1s

Reputation: 36

Regular expression for not more than one occurrence of consecutive characters

I'm looking for a regular expression that matches only if a pair of identical consecutive characters occurs exactly once in the string.

for example:

I currently have this regex: ([0-9])\1{1,} but it also matches 1122345, which is not what I need.

Upvotes: 0

Views: 200

Answers (1)

Quasímodo

Reputation: 4004

This awk script does it, provided you have mawk or GNU awk (gawk):

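awk -F "" '
{
    d=0                          # number of adjacent equal pairs on this line
    for(i=1;i<NF;i++){
        if ($i==$(i+1)) d++      # character i equals character i+1
    }
    if (d==1) print              # print lines with exactly one such pair
}' file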

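Setting the field separator to the empty string ("") makes every character its own field, so you can read each line character by character. Whenever character i equals character i+1, d is incremented, so d counts the adjacent equal pairs in the line. The line is printed only if d==1. Note that a run of three identical characters, as in 1112345, counts as two pairs and is therefore rejected.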
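To see the counting at work, here is a small sketch that prints each line together with its value of d instead of filtering. The function name count_pairs is made up for illustration, and it walks the line with substr() rather than relying on the empty field separator, so it is a variation on the script above rather than the same code.

```shell
# count_pairs: hypothetical helper, prints every input line followed by d,
# the number of positions where a character equals its right-hand neighbour.
count_pairs() {
    awk '{
        d = 0
        n = length($0)
        # compare each character with the one that follows it
        for (i = 1; i < n; i++)
            if (substr($0, i, 1) == substr($0, i + 1, 1)) d++
        print $0, d
    }'
}

printf '%s\n' 1123456 1122345 1121125 1234567 1112345 | count_pairs
```

Only the line whose count is exactly 1 would pass the d==1 test; lines with no pair, two separate pairs, or a triple get 0 or 2.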

From your sample:

$ cat file
1123456
1122345
1121125
1234567
1112345

It outputs:

1123456

Important remark:

The GNU awk manual calls the use of the empty string as field separator a "dark corner", meaning that it is not standard and some implementations may handle it differently. If you want to rely less on that behaviour, use split() instead:

awk '
{
    d=0
    n=split($0,ch,"")            # ch[1..n] holds the characters of the line
    for(i=1;i<n;i++){
        if (ch[i]==ch[i+1]) d++
    }
    if (d==1) print
}' file

This version also runs under gawk --posix and yields the same result.
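As far as I know, POSIX also leaves splitting on an empty separator unspecified for split(), so if portability is the main concern, a version built only on length() and substr() may be the safer bet. The sketch below is one way to write it; the wrapper name one_pair is an assumption for illustration and is not part of the scripts above.

```shell
# one_pair: hypothetical wrapper, prints only the lines that contain exactly
# one position where a character equals the character right after it.
# Reads the files given as arguments, or standard input if there are none.
one_pair() {
    awk '{
        d = 0
        n = length($0)
        for (i = 1; i < n; i++)
            if (substr($0, i, 1) == substr($0, i + 1, 1)) d++
        if (d == 1) print
    }' "$@"
}

printf '%s\n' 1123456 1122345 1121125 1234567 1112345 | one_pair
```

With the sample lines it prints only 1123456, the same result as the two scripts above. Calling it as one_pair file reads from a file instead of standard input.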

Upvotes: 1
