How to print the lines that contain certain strings, in a given order?

Question

I have two files

file indv

COPDGene_P51515
COPDGene_V67803
COPDGene_Z75868
COPDGene_U48329
COPDGene_R08908
COPDGene_E34944

file data

COPDGene_Z75868  1
COPDGene_A12318  3
COPDGene_R08908  5
COPDGene_P51515  8
COPDGene_U48329  2
COPDGene_V67803  8
COPDGene_E34944  2
COPDGene_D29835  9

I want to print the lines of data that contain the strings listed in indv, in the order they appear in indv, like the following:

COPDGene_P51515  8
COPDGene_V67803  8
COPDGene_Z75868  1
COPDGene_U48329  2
COPDGene_R08908  5
COPDGene_E34944  2

I tried to use

awk 'NR==FNR{a[$1]++;next} ($1 in a)' indv data

But I got

COPDGene_Z75868  1
COPDGene_R08908  5
COPDGene_P51515  8
COPDGene_U48329  2
COPDGene_V67803  8
COPDGene_E34944  2

which is not the order of indv.

John1024 · Accepted Answer

$ awk 'FNR==NR{a[$1]=$0;next;} {print a[$1]}' data indv
COPDGene_P51515  8
COPDGene_V67803  8
COPDGene_Z75868  1
COPDGene_U48329  2
COPDGene_R08908  5
COPDGene_E34944  2

How it works

FNR==NR{a[$1]=$0;next;}

For the first file read, data, save each line in associative array a under the index of its first field, $1. Then skip the rest of the commands and start over on the next line.

print a[$1]

If we get here, we are working on the second file, indv. For each line of this file, print the line from data whose first field matches the first field on this line. In this way, the contents of each printed line are controlled by data, but the order of printing is controlled by indv.
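One caveat: if indv lists an ID that never appears in data, a[$1] is empty and the command above prints a blank line for it. A minimal sketch of a variant that skips such IDs, assuming that is the behaviour you want, adds a ($1 in a) guard. The scratch directory and the ID COPDGene_X00000 are made up for this illustration and are not part of the question's files.

```shell
# Build small sample files in a scratch directory (hypothetical paths).
tmp=$(mktemp -d)
# indv: the desired order; COPDGene_X00000 is a made-up ID that is absent from data.
printf '%s\n' COPDGene_P51515 COPDGene_V67803 COPDGene_X00000 COPDGene_Z75868 > "$tmp/indv"
# data: a shortened version of the question's data file.
printf '%s\n' 'COPDGene_Z75868  1' 'COPDGene_A12318  3' 'COPDGene_P51515  8' 'COPDGene_V67803  8' > "$tmp/data"

# First pass (data): store each line under its first field.
# Second pass (indv): print the stored line only if the ID was seen in data.
result=$(awk 'FNR==NR{a[$1]=$0; next} ($1 in a){print a[$1]}' "$tmp/data" "$tmp/indv")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With these sample files, the three IDs present in data are printed in indv order and the made-up ID is dropped silently instead of producing an empty line.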
