Reputation: 75
Let's say we have the following data:
B346879 length: 12 B34 L677
B111879 length: 32 B33 L677
B545879 length: 34 B34 L677
B344879 length: 98 B33 L677
B090879 length: 45 B33 L677
I'm looking for 'B34', so from this series I would want to print the 1st and 3rd lines. But if I use:
cat t.txt | sed -n '/B34/p' | awk '{print $1", "$4" "$5}' | sed 's/B//g'
the 4th line would also be printed, because 'B34' matches the first 3 characters of 'B344879'. I know that with something like sed 's/pattern/replacement/n'
you can replace only the nth occurrence of a regex. But how does that work with printing? I tried things like sed -n '/B34/2p'
but that's invalid.
Upvotes: 4
Views: 2245
Reputation: 58371
This might work for you (GNU sed):
sed -rn 's/^\s*(\S+)\s+(\S+\s+){2}(B34)\s+(\S+)/\1, \3 \4/;T;s/B//g;p' file
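The one-liner is dense, so here is the same script spread over several lines with comments. This is only a sketch of how it reads; it assumes GNU sed (for -r, \s, \S and the T command), and print_b34 is a hypothetical wrapper name used purely for illustration.

```shell
# Hypothetical helper: reads the data on stdin and prints "ID, code last-column".
print_b34() {
  sed -rn '
    # Capture field 1, skip fields 2 and 3, require field 4 to be exactly B34,
    # capture field 5, and rewrite the line as "field1, B34 field5".
    s/^\s*(\S+)\s+(\S+\s+){2}(B34)\s+(\S+)/\1, \3 \4/
    # T: if the substitution did not happen, jump to the end of the script,
    # so nothing is printed for this line.
    T
    # Strip every B from the rewritten line, then print it.
    s/B//g
    p
  '
}
```

Usage would be something like print_b34 < t.txt. Because field 4 is matched between whitespace on both sides, B344879 in the first column cannot trigger a match.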
Upvotes: 2
Reputation: 70722
You can use the word-boundary anchors \< and \> here (supported by GNU sed):
cat t.txt | sed -n '/\<B34\>/p' | awk '{print $1", "$4" "$5}' | sed 's/B//g'
Output
346879, 34 L677
545879, 34 L677
To print only the matching lines:
sed -n '/\<B34\>/p' t.txt
Output
B346879 length: 12 B34 L677
B545879 length: 34 B34 L677
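To see why the boundaries matter, the sketch below runs the unanchored and the anchored pattern over the question's data and counts the matching lines. It assumes GNU sed, since \< and \> are not part of POSIX; sample is a hypothetical helper that just emits the sample data.

```shell
# Hypothetical helper that prints the sample data from the question.
sample() {
  cat <<'EOF'
B346879 length: 12 B34 L677
B111879 length: 32 B33 L677
B545879 length: 34 B34 L677
B344879 length: 98 B33 L677
B090879 length: 45 B33 L677
EOF
}

# Unanchored: B34 is also found inside B346879 and B344879 in the first column.
plain=$(( $(sample | sed -n '/B34/p' | wc -l) ))

# Anchored: B34 must stand alone as a word, so only the 4th column can match.
bounded=$(( $(sample | sed -n '/\<B34\>/p' | wc -l) ))

echo "plain=$plain bounded=$bounded"
```

The unanchored pattern selects three lines (1, 3 and 4), while the anchored one selects only lines 1 and 3.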
Upvotes: 1
Reputation: 195029
Try this line:
awk '$4=="B34"' file
The rest of your commands (sed, cat, ...) can be merged into the one-liner above:
awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}' file
Test it with your example:
kent$ echo " B346879 length: 12 B34 L677
B111879 length: 32 B33 L677
B545879 length: 34 B34 L677
B344879 length: 98 B33 L677
B090879 length: 45 B33 L677"|awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}'
346879 34 L677
545879 34 L677
EDIT
By default awk splits fields on runs of whitespace (FS is a single space), so it doesn't matter how long your 3rd field is. For example:
kent$ echo " B346879 length: 17777777777777777772 B34 L677
B111879 length: 32 B33 L677
B545879 length: 34 B34 L677
B344879 length: 98 B33 L677
B090879 length: 45 B33 L677"|awk '$4=="B34"{gsub(/B/,"");print $1,$4,$5}'
346879 34 L677
545879 34 L677
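A quick way to check the default splitting for yourself is to print the field count next to the 4th field. This is a minimal sketch; fourth_field is a hypothetical name, and the input deliberately mixes leading blanks, a tab and repeated spaces.

```shell
# Hypothetical helper: prints "<number of fields>: <4th field>" for each line.
fourth_field() {
  awk '{ print NF ": " $4 }'
}

# Leading blanks, a tab and runs of spaces all count as one separator each.
printf '  B346879\tlength:   12    B34 L677\n' | fourth_field
# prints "5: B34"
```

Since $4 is compared as a whole field, '$4=="B34"' cannot be fooled by B344879 in the first column.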
EDIT
OK, I see what you mean, so this should work:
awk -F'length:[ 0-9]*' '$2~/^B34/{sub(/B/,"",$1);sub(/B/,"",$2);print $1,$2}'
See the test: the first line is the special case (no space after length:).
kent$ echo " B346879 length:212 B34 L677
B111879 length: 32 B33 L677
B545879 length: 34 B34 L677
B344879 length: 98 B33 L677
B090879 length: 45 B33 L677"|awk -F'length:[ 0-9]*' '$2~/^B34/{sub(/B/,"",$1);sub(/B/,"",$2);print $1,$2}'
346879 34 L677
545879 34 L677
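One caveat: $2~/^B34/ is a prefix match, so a code such as B344 in that column would also pass. A possible stricter variant is sketched below; it keeps the same -F regex but compares the code exactly. The name strict_b34 is hypothetical, and the sketch assumes the code is the first word after the length value.

```shell
# Hypothetical helper: same field separator as above, but an exact code match.
strict_b34() {
  awk -F'length:[ 0-9]*' '{
    # $1 is everything before "length:", $2 is everything after the number.
    n = split($2, rest, " ")            # rest[1] = code, rest[2] = last column
    if (n >= 1 && rest[1] == "B34") {   # exact comparison, so B344 is rejected
      id = $1
      gsub(/[ \tB]/, "", id)            # drop blanks and the B prefix from the ID
      sub(/B/, "", rest[1])             # drop the B prefix from the code
      print id, rest[1], rest[2]
    }
  }'
}
```

It reads from stdin, for example strict_b34 < t.txt, and prints the fields separated by single spaces.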
Upvotes: 1