user4840141

Awk multiple columns in 2 files and output the matching lines

I have two input files. The first is x.txt:

C20775336       maker   gene    1895    2166    .       -       .       ID=gene1;Name=maker-C20775336-augustus-gene-0.0
C20775336       maker   gene    3097    4624    .       -       .       Parent=mRNA1

The second file is y.txt:

scaffold4557    hsal_OGSv3.3    gene    3097    4624    74.8    +       .       ID=HSAL10661-RA;Parent=HSAL10661;Name=HSAL10661-RA;Alias=Hsal_17580--XP_001599845.1_NASVI
C20775336       maker   gene     1895    1962    .       -       2       ID=CDS1;Parent=mRNA1

I would like to compare column 4 and column 5 between the two files. If a line in y.txt has the same values in both columns as some line in x.txt, that line from y.txt should be printed. For the input above, the output should be:

scaffold4557    hsal_OGSv3.3    gene    3097    4624    74.8    +       .       ID=HSAL10661-RA;Parent=HSAL10661;Name=HSAL10661-RA;Alias=Hsal_17580--XP_001599845.1_NASVI

I tried using awk but was not successful. Thanks in advance.

Upvotes: 1

Views: 592

Answers (1)

John1024

Reputation: 113964

$ awk 'FNR==NR{seen[$4,$5]=1;next} ($4,$5) in seen' x.txt y.txt 
scaffold4557    hsal_OGSv3.3    gene    3097    4624    74.8    +       .       ID=HSAL10661-RA;Parent=HSAL10661;Name=HSAL10661-RA;Alias=Hsal_17580--XP_001599845.1_NASVI
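
For anyone who wants to see how the one-liner works, here is a commented sketch of the same command. `FNR==NR` is true only while awk reads the first file, so `seen[$4,$5]=1` records every (column 4, column 5) pair from x.txt, and `next` skips the rest of the program for those lines. For y.txt, the bare pattern `($4,$5) in seen` prints any line whose pair was recorded. The temporary directory, the `tmp` and `result` variable names, and the sample lines with whitespace collapsed to single spaces are my own assumptions for a self-contained demo, not part of the original command.

```shell
# Recreate the two sample files in a scratch directory (assumed layout).
tmp=$(mktemp -d)
printf '%s\n' \
  'C20775336 maker gene 1895 2166 . - . ID=gene1;Name=maker-C20775336-augustus-gene-0.0' \
  'C20775336 maker gene 3097 4624 . - . Parent=mRNA1' > "$tmp/x.txt"
printf '%s\n' \
  'scaffold4557 hsal_OGSv3.3 gene 3097 4624 74.8 + . ID=HSAL10661-RA;Parent=HSAL10661;Name=HSAL10661-RA;Alias=Hsal_17580--XP_001599845.1_NASVI' \
  'C20775336 maker gene 1895 1962 . - 2 ID=CDS1;Parent=mRNA1' > "$tmp/y.txt"

result=$(awk '
  # First file only: remember each (col4, col5) pair, then skip to the next line.
  FNR == NR { seen[$4, $5] = 1; next }
  # Second file: a pattern with no action prints the line when the pair was seen.
  ($4, $5) in seen
' "$tmp/x.txt" "$tmp/y.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat: if x.txt is empty, `FNR==NR` is also true for y.txt, so nothing is printed. That happens to be the right answer here, but it is worth knowing if you adapt the idiom.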

Upvotes: 3
