Reputation: 36
I'm looking for a regular expression that matches only if a pair of consecutive identical characters occurs exactly once in the string.
For example:
I currently have this regex: ([0-9])\1{1,} but it also matches 1122345, which is not what I need.
Upvotes: 0
Views: 200
Reputation: 4004
This awk does it, if you have minimal awk (mawk) or GNU awk (gawk):
awk -F "" '
{
    d=0
    for(i=1;i<NF;i++){
        if ($i==$(i+1)) d++
    }
    if (d==1) print
}' file
By setting the field separator to the empty string ("") you can read each line character by character. If character i equals character i+1, d is incremented. If d==1 once the whole line has been scanned, the line is printed.
From your sample:
$ cat file
1123456
1122345
1121125
1234567
1112345
It outputs:
1123456
Important remark:
The GNU awk manual says that using the empty string as the field separator is a "dark corner", meaning that it is not standard and some implementations may handle it differently. If you want to avoid relying on that behaviour, use split() instead:
awk '
{
    d=0
    n=split($0,ch,"")
    for(i=1;i<n;i++){
        if (ch[i]==ch[i+1]) d++
    }
    if (d==1) print
}' file
It passed the gawk --posix test and yields the same result.
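If you would rather not depend on empty-string splitting at all, here is a minimal sketch of the same counting idea using only substr() and length(), which as far as I know are both part of POSIX awk. The shell function name one_pair is my own, purely for illustration; it is not part of any tool.

```sh
# Hypothetical helper (the name one_pair is an assumption for this example).
# Prints every input line that contains exactly one pair of adjacent
# identical characters. Reads the files given as arguments, or stdin.
one_pair() {
    awk '{
        d = 0                      # adjacent equal pairs found on this line
        n = length($0)
        for (i = 1; i < n; i++) {
            # compare character i with character i+1
            if (substr($0, i, 1) == substr($0, i + 1, 1)) d++
        }
        if (d == 1) print          # exactly one pair: keep the line
    }' "$@"
}
```

Called as one_pair file on the sample above, it prints only 1123456. Note that, like the two scripts above, a run of three identical characters such as 111 counts as two pairs, so 1112345 is rejected.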
Upvotes: 1