van Ness

Reputation: 75

sed print line with nth occurrence of regex

Let's say we have the following data:

 B346879 length: 12 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677

I'm looking for 'B34', so from this series I want to print the 1st and 3rd lines. But if I use:

cat t.txt | sed -n '/B34/p' | awk '{print $1", "$4" "$5}' | sed 's/B//g'

The 4th line would also be printed, because 'B34' matches the first 3 characters of 'B344879'. I know that with something like sed 's/pattern/replacement/n' you can replace only the nth occurrence of a regex on a line. But how does that work with printing? I tried things like sed -n '/B34/2p', but that's invalid.

Upvotes: 4

Views: 2245

Answers (4)

potong

Reputation: 58371

This might work for you (GNU sed):

sed -rn 's/^\s*(\S+)\s+(\S+\s+){2}(B34)\s+(\S+)/\1, \3 \4/;T;s/B//g;p' file
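For readers who find the one-liner dense, here is the same script split into one `-e` expression per command, run against the sample data from the question. This is only a sketch for reading along; the variable names `data` and `result` are made up for illustration, and it assumes GNU sed (`-r`, `\s`, `\S` and the `T` command are GNU extensions).

```shell
# Sample data from the question (leading spaces kept as posted).
data=' B346879 length: 12 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677'

# Expression 1: capture field 1, skip fields 2 and 3, require field 4 to be
#               exactly B34, capture field 5, and rewrite as "f1, f4 f5".
# Expression 2: T jumps to the end of the script when no substitution was
#               made, so non-matching lines are never printed (because of -n).
# Expression 3: strip every B from the rewritten line.
# Expression 4: print the result.
result=$(printf '%s\n' "$data" | sed -rn \
  -e 's/^\s*(\S+)\s+(\S+\s+){2}(B34)\s+(\S+)/\1, \3 \4/' \
  -e 'T' \
  -e 's/B//g' \
  -e 'p')

printf '%s\n' "$result"
```

Because the pattern demands whitespace right after `B34` in the 4th field, the line starting with `B344879` does not match.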

Upvotes: 2

hwnd

Reputation: 70722

You can use the word-boundary notation \< ... \> here, so that B34 only matches as a whole word.

cat t.txt | sed -n '/\<B34\>/p' | awk '{print $1", "$4" "$5}' | sed 's/B//g'

Output

346879, 34 L677
545879, 34 L677

To print just the matching lines:

sed -n '/\<B34\>/p' t.txt

Output

B346879 length: 12 B34 L677
B545879 length: 34 B34 L677
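To see the difference the word boundaries make, a small sketch can print the numbers of the matching lines with and without them. The variable names (`data`, `loose`, `strict`) are illustrative only, and the `\<` / `\>` syntax is assumed to be available, as it is in GNU sed.

```shell
# Sample data from the question.
data=' B346879 length: 12 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677'

# "=" prints the line number of each matching line; paste joins them with commas.
# Plain /B34/ also matches inside B344879 on line 4.
loose=$(printf '%s\n' "$data" | sed -n '/B34/=' | paste -sd, -)

# With \< and \>, B34 must stand alone as a word.
strict=$(printf '%s\n' "$data" | sed -n '/\<B34\>/=' | paste -sd, -)

echo "without boundaries: lines $loose"
echo "with boundaries:    lines $strict"
```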

Upvotes: 1

Kent

Reputation: 195029

Try this line:

awk '$4=="B34"' file

The rest of your commands (sed, cat, ...) can be merged into the one-liner above:

awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}' file

Test it with your example:

kent$  echo " B346879 length: 12 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677"|awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}' 
346879 34 L677
545879 34 L677

EDIT

awk uses whitespace as the default field separator (FS), so it doesn't matter how long your 3rd field is. For example:

kent$  echo " B346879 length: 17777777777777777772 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677"|awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}' 
346879 34 L677
545879 34 L677

EDIT

OK, I see what you mean, so this should work:

awk -F'length:[ 0-9]*' '$2~/^B34/{sub(/B/,"",$1);sub(/B/,"",$2);print $1,$2}' 

See the test; the first line is the special case (no space between 'length:' and the number).

kent$  echo " B346879 length:212 B34 L677
 B111879 length: 32 B33 L677
 B545879 length: 34 B34 L677
 B344879 length: 98 B33 L677
 B090879 length: 45 B33 L677"|awk -F'length:[ 0-9]*' '$2~/^B34/{sub(/B/,"",$1);sub(/B/,"",$2);print $1,$2}'
 346879  34 L677
 545879  34 L677
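One caveat that may be worth checking: `$2~/^B34/` is a prefix test, so a value such as B345 would pass it too. The sketch below compares it with a variant that requires a space after B34. The second input line, with `B345`, is hypothetical data invented for this check, and the variable names are illustrative only.

```shell
# Line 1 is the special case from the test above; line 2 is hypothetical.
data=' B346879 length:212 B34 L677
 B111879 length: 32 B345 L677'

# Prefix test as in the command above: matches B34 and B345 alike.
prefix=$(printf '%s\n' "$data" |
  awk -F'length:[ 0-9]*' '$2~/^B34/{print NR}' | paste -sd, -)

# Requiring a space after B34 keeps only the exact value.
exact=$(printf '%s\n' "$data" |
  awk -F'length:[ 0-9]*' '$2~/^B34 /{print NR}' | paste -sd, -)

echo "prefix test matches lines: $prefix"
echo "with trailing space:       $exact"
```

If the data never contains longer values starting with B34, the original pattern is fine as it stands.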

Upvotes: 1

piokuc

Reputation: 26164

cat t.txt | awk '$4 == "B34" {print $1", "$4" "$5}'|sed s/B//g

Upvotes: 1
