Redson

Reputation: 2140

Find differences and similarities between two text files using awk

I have two files:

file 1

1
2
34:rt
4

file 2

1
2
34:rt
7

I want to display the rows that are in file 2 but not in file 1, the rows that are in file 1 but not in file 2, and the rows that appear in both files. The expected result should look like:

1 in both
2 in both
34:rt in both
4 in file 1
7 in file 2

This is what I have so far but I am not sure if this is the right structure:

 awk '
    FNR == NR {
        a[$0]++;
        next;
    }

    !($0 in a) {
        # print not in file 1
    }

    ($0 in a) {

        for (i = 0; i <= NR; i++) {
            if (a[i] == $0) {
                # print same in both
            }
        }

        delete a[$0]  # deletes entries which are processed
    }

    END {
        for (rest in a) {
            # print not in file 2
        }
    }' $PWD/file1 $PWD/file2

Any suggestions?

Upvotes: 1

Views: 219

Answers (1)

jaypal singh

Reputation: 77085

If the order is not relevant then you can do:

awk '
NR==FNR { a[$0]++; next }
{
    print $0, ($0 in a ? "in both" : "in file2");
    delete a[$0]
}
END {
    for(x in a) print x, "in file1"
}' file1 file2

Output:

1 in both
2 in both
34:rt in both
7 in file2
4 in file1
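
If the order does matter, one possible variation is to remember the lines of each file in the order they were read and do all the printing in END. This is only a sketch under assumptions: classify_lines is a hypothetical helper name, the scratch directory is just for the demo, and repeated lines within a file are reported once. It prints the lines of the first file (labelled "in both" or "in file1") followed by the lines found only in the second file.

```shell
# classify_lines is a hypothetical helper name, not part of the original answer.
classify_lines() {
    awk '
    NR == FNR {                      # first file: record each distinct line in order
        if (!($0 in in1)) { in1[$0]; order1[++n1] = $0 }
        next
    }
    !($0 in in2) {                   # second file: record each distinct line in order
        in2[$0]; order2[++n2] = $0
    }
    END {
        # Lines of the first file, labelled by whether the second file has them too.
        for (i = 1; i <= n1; i++)
            print order1[i], (order1[i] in in2 ? "in both" : "in file1")
        # Then the lines that only the second file has.
        for (i = 1; i <= n2; i++)
            if (!(order2[i] in in1)) print order2[i], "in file2"
    }' "$1" "$2"
}

# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 1 2 34:rt 4 > "$dir/file1"
printf '%s\n' 1 2 34:rt 7 > "$dir/file2"
classify_lines "$dir/file1" "$dir/file2"
```

Because nothing is deleted from the arrays, a line that occurs twice in file2 is not mislabelled on its second occurrence. Note that the NR == FNR test assumes the first file is not empty.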

Or using comm, as suggested by choroba in the comments (comm expects both files to be sorted, and --output-delimiter is a GNU extension):

comm --output-delimiter="|" file1 file2 | 
awk -F'|' '{print (NF==3 ? $NF " in both" : NF==2 ? $NF " in file2" : $NF " in file1")}'

Output:

1 in both
2 in both
34:rt in both
4 in file1
7 in file2
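
If the inputs are not already sorted, or your comm has no --output-delimiter option, a rough sketch along the following lines may help. It relies on the default tab separator that comm puts in front of columns 2 and 3, so it assumes the lines themselves contain no tabs. compare_sorted is a hypothetical helper name and the sample files are made up for the demo.

```shell
# compare_sorted is a hypothetical helper name, not part of the original answer.
compare_sorted() {
    tmp=$(mktemp -d)
    # Sort copies of both inputs in the C locale so comm sees the order it expects.
    LC_ALL=C sort "$1" > "$tmp/a"
    LC_ALL=C sort "$2" > "$tmp/b"
    # comm prefixes column 2 with one tab and column 3 with two tabs,
    # so the field count tells us which column a line came from.
    LC_ALL=C comm "$tmp/a" "$tmp/b" |
    awk -F'\t' '{ print $NF, (NF == 3 ? "in both" : NF == 2 ? "in file2" : "in file1") }'
    rm -rf "$tmp"
}

# Unsorted versions of the sample data from the question.
dir=$(mktemp -d)
printf '%s\n' 4 34:rt 2 1 > "$dir/file1"
printf '%s\n' 7 1 34:rt 2 > "$dir/file2"
compare_sorted "$dir/file1" "$dir/file2"
```

The result comes out in sorted order rather than in the original file order, which is the trade-off of going through comm.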

Upvotes: 1
