Sam

Reputation: 2341

Grepping for overlapping pattern matches

This is what I'm running:

grep -o ',[tcb],' <<< "r,t,c,q,c b,b,"

The output is:

,t,
,b,

But I want to get:

,t,
,c,
,b,

(I do not want the b without a preceding , or the c without a trailing , to be matched)

This is because ,[tcb], should be found at three places (marked here with double quotes): 'r",t,"c,q,c b,b,', 'r,t",c,"q,c b,b,' and 'r,t,c,q,c b",b,"'.

But it seems that once a comma has been consumed by one match, grep does not consider it again as the start of the next match.

Is there a way around this, or is grep not meant to do this?

Upvotes: 3

Views: 561

Answers (3)

Thomas B Preusser

Reputation: 1189

You can use grep with a Perl RE (-P), which supports zero-width look-behind and look-ahead assertions, to extract the letters surrounded by commas. Because the assertions do not consume the commas, neighbouring matches can share one. You can then restore the separators as needed:

grep -o -P '(?<=,)[tcb](?=,)' <<< "r,t,c,q,c b,b," | while read -r c; do echo ",$c,"; done
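
A variant of the same idea, as a rough sketch: sed puts the commas back instead of the while loop. It assumes a grep built with PCRE support (the -P option, as in GNU grep); the function name match_between_commas is made up for this example.

```shell
# Hypothetical helper; requires a grep that supports -P (e.g. GNU grep).
match_between_commas() {
  # The look-behind and look-ahead check for the commas without matching them,
  # so -o prints only the letter itself.
  printf '%s\n' "$1" | grep -oP '(?<=,)[tcb](?=,)' |
    sed 's/.*/,&,/'   # wrap each extracted letter in commas again
}

match_between_commas "r,t,c,q,c b,b,"
```

For the sample input this should print ,t, then ,c, then ,b, on separate lines.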

Upvotes: 2

Jean-François Fabre

Reputation: 140256

The awk solution is nice. I have another with sed+grep:

echo "r,t,c,q,c b,b," | sed "s/,/,,/g" | grep -o ',[tcb],'

,t,
,c,
,b,
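
Why this works, as far as I can tell: doubling every comma gives each field its own leading and trailing separator, so two neighbouring matches no longer compete for the same comma. A small wrapper to try it on other inputs (the name overlapping_commas is only for illustration; grep -o is a common GNU/BSD extension rather than POSIX):

```shell
# Hypothetical wrapper around the sed + grep pipeline above.
overlapping_commas() {
  # "r,t,c" becomes "r,,t,,c": ",t," and ",c," now each have their own commas.
  printf '%s\n' "$1" | sed 's/,/,,/g' | grep -o ',[tcb],'
}

overlapping_commas "r,t,c,q,c b,b,"
```

A letter at the very start or end of the line still has a comma on one side only after the substitution, so it is not reported.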

Upvotes: 1

anubhava

Reputation: 785471

You can use awk instead of grep for this, with the comma as the record separator:

awk -v RS=, '/^[tcb]$/{print RS $0 RS}' <<< "r,t,c,q,c b,b,"

,t,
,c,
,b,
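
One thing to watch for, given the requirement that a letter must have a comma on both sides: the first record starts at the beginning of the input, so a leading t, c or b with no comma before it would also be printed. A possible guard, sketched with a made-up function name (fields_between_commas), is to skip record 1:

```shell
# Hypothetical helper; same awk idea with a guard for the first record.
fields_between_commas() {
  # NR > 1 skips the first record, which has no comma in front of it.
  # The last record still carries the trailing newline added by printf,
  # so a final letter without a closing comma does not match ^[tcb]$.
  printf '%s\n' "$1" | awk -v RS=, 'NR > 1 && /^[tcb]$/ { print RS $0 RS }'
}

fields_between_commas "t,c,q,b"
```

Here only ,c, is printed: the leading t has no comma before it and the final b has none after it.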

Upvotes: 3
