zychen

Reputation: 1

awk pipe matched contents

I used awk and grep in a pipeline to get the following contents, which I'll call A (this is part of my file):

LOC_Os04g47290
LOC_Os04g53190,LOC_Os04g53195
LOC_Os09g20260

I want to use these IDs to grep B, so that I get the matching IDs together with other fields from the matched lines (this is part of my file B):

_O2 int381,int382,int384,int385,int386,int387,int388,int391,int392,int393,int394,int395,int396,int397,int398,int399,int400,int401,int402,int403,int404,int408,int409,int410,int412,int413,int414:chr4:31119012..31944575    chr4:31669055..31674598 LOC_Os04g53190,LOC_Os04g53195   CPuORF12,expressed - conserved peptide uORF-containing transcript, expressed ; protein ;        PF01593 Amino_oxidase   0.0539946

When I use

cat a|awk -F"," '{for (i=1;i<=NF;i++)print $i}'|grep -f - B|grep PF|awk '{print $4"\t"$(NF-2)}'

I get

LOC_Os04g53190,LOC_Os04g53195   PF01593

But I want to print

  LOC_Os04g53190 PF01593
  LOC_Os04g53195 PF01593

Upvotes: 0

Views: 171

Answers (2)

Jose Ricardo Bustos M.

Reputation: 8164

Improve the last awk statement by splitting field 4 on commas and printing one line per ID:

cat a | 
awk -F"," '{for (i=1;i<=NF;i++)print $i}' | 
grep -f - B | 
grep PF | 
awk '{n=split($4,v,","); for(i=1; i<=n; ++i) print v[i]"\t"$(NF-2)}'

You get:

LOC_Os04g53190  PF01593
LOC_Os04g53195  PF01593
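To try the pipeline without the real data, here is a minimal sketch. The temporary files and the shortened line for B are hypothetical sample data modeled on the question. As a variation that is not part of the answer above, it passes `-F -w` to grep so the IDs are matched as fixed whole words rather than as regular expressions, and it writes the IDs to a file instead of using `-f -`.

```shell
#!/bin/sh
# Hypothetical sample data, shortened from the question.
tmp=$(mktemp -d)
printf '%s\n' 'LOC_Os04g47290' 'LOC_Os04g53190,LOC_Os04g53195' 'LOC_Os09g20260' > "$tmp/a"
printf '_O2\tint381,int382:chr4:31119012..31944575\tchr4:31669055..31674598\tLOC_Os04g53190,LOC_Os04g53195\tCPuORF12,expressed ; protein ;\tPF01593\tAmino_oxidase\t0.0539946\n' > "$tmp/B"

# One ID per line (assumes a has no empty lines: an empty
# pattern would make grep -F match every line of B).
tr ',' '\n' < "$tmp/a" > "$tmp/ids"

result=$(
    grep -F -w -f "$tmp/ids" "$tmp/B" |
    awk '$(NF-2) ~ /^PF/ {
        # Split the comma-separated ID field and print one line per ID.
        n = split($4, v, ",")
        for (i = 1; i <= n; i++) print v[i] "\t" $(NF-2)
    }'
)
printf '%s\n' "$result"
rm -rf "$tmp"
```

Like the pipeline above, this prints every ID found in field 4 of a matched line, including any that are not listed in a.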

Bonus: an awk-only solution

awk '
    NR==FNR{d[$1]; next}
    $(NF-2) ~ /^PF/{
        n=split($4,v,",")
        for(i=1; i<=n; ++i) if(v[i] in d) print v[i]"\t"$(NF-2)
    }
' RS="[\n,]" a RS="\n" B
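The awk-only version relies on a regular expression in `RS`, which gawk and mawk accept but POSIX leaves unspecified for multi-character values. As a hedged alternative, the sketch below keeps the default record separator and splits each line of a on commas instead. The array names `ids` and `wanted` and the sample files are hypothetical, and the B line is shortened from the question.

```shell
#!/bin/sh
# Hypothetical sample data, shortened from the question.
tmp=$(mktemp -d)
printf '%s\n' 'LOC_Os04g47290' 'LOC_Os04g53190,LOC_Os04g53195' 'LOC_Os09g20260' > "$tmp/a"
printf '_O2\tint381,int382:chr4:31119012..31944575\tchr4:31669055..31674598\tLOC_Os04g53190,LOC_Os04g53195\tCPuORF12,expressed ; protein ;\tPF01593\tAmino_oxidase\t0.0539946\n' > "$tmp/B"

result=$(awk '
    NR == FNR {                      # first file (a): collect the wanted IDs
        n = split($0, ids, ",")
        for (i = 1; i <= n; i++) wanted[ids[i]]
        next
    }
    $(NF-2) ~ /^PF/ {                # second file (B): lines with a PF accession
        n = split($4, v, ",")
        for (i = 1; i <= n; i++)
            if (v[i] in wanted) print v[i] "\t" $(NF-2)
    }
' "$tmp/a" "$tmp/B")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Unlike the grep pipeline, this prints only the IDs that actually appear in a, because each ID from field 4 is checked against the `wanted` array.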

Upvotes: 1

Sharad

Reputation: 10592

Sample file

sharad$ cat sample_file
foo
bar
sharad$ 

Capture matching contents into a variable

sharad$ match=$(cat sample_file | grep foo)

Capture non-matching contents into another variable

sharad$ non_match=$(cat sample_file | grep -v foo)
sharad$ 

Verify the contents of matching and non-matching variables (grep -v)

sharad$ echo "$match"
foo
sharad$ echo "$non_match"
bar
sharad$

From man grep:

-v, --invert-match Selected lines are those not matching any of the specified patterns.
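A small sketch of the same idea when more than one line matches. The temporary file and its three lines are hypothetical. Quoting the variables when printing keeps one match per line, whereas an unquoted `echo $match` would join them with spaces.

```shell
#!/bin/sh
# Hypothetical sample file with two lines that contain "foo".
tmp=$(mktemp)
printf '%s\n' foo bar foobar > "$tmp"

match=$(grep foo "$tmp")          # lines containing foo
non_match=$(grep -v foo "$tmp")   # every other line

printf 'matching:\n%s\n' "$match"
printf 'non-matching:\n%s\n' "$non_match"
rm -f "$tmp"
```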

Upvotes: 0
