crystal

Reputation: 37

Compare two files with different numbers, but with same text

I have two files in which each line holds a number and a path to a file. The paths are the same in both files, but the numbers can differ.

--- File 1 ---
1198464 ./aaa/file_A
   6345 ./bbb/file_B
  24345 ./bbb/file_C
2345212 ./ccc/file_D
  92315 ./ddd/file_E
 452217 ./ddd/file_F


--- File 2 ---
1198464 ./aaa/file_A
   1234 ./bbb/file_B
    340 ./bbb/file_C
 452217 ./ddd/file_F

My goal is to print out the number and path IF the path is in BOTH files and the NUMBER is different. In this case, the NUMBER from File 1 should be printed out. Expected output is:

   6345 ./bbb/file_B
  24345 ./bbb/file_C

My best shot is the following command:

awk 'FNR==NR {lines[$2]; next} ($2 in lines) && ($1 not in lines)'  File2 File1

But "($1 not in lines)" doesn't work. How can I make it work?

Upvotes: 1

Views: 98

Answers (3)

Daweo

Reputation: 36430

> But "($1 not in lines)" doesn't work. How can I make it work?

($1 not in lines) is not the negation of ($1 in lines). awk has no not keyword, so not is seen as an unset variable and therefore works like an empty string. Consider:

BEGIN{print "1" not "0"}

output:

10

Therefore ($1 not in lines) is evaluated as ($1 "" in lines), which is the same as ($1 in lines). If you want to check that something is not in an array, use !(x in arr), for example:

BEGIN{
  arr[1]="a";arr[3]="b";arr[5]="c"
  for(x=1;x<=5;x+=1){print x " -> " !(x in arr)}
}

output:

1 -> 0
2 -> 1
3 -> 0
4 -> 1
5 -> 0
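
A note on applying this to the command in the question: swapping in !($1 in lines) makes the test a real negation, but as far as I can tell it still does not give the expected output, because lines holds only paths as keys and a number is never among them. A minimal sketch with the question's sample data (the temporary directory and the variable name negated are only for illustration):

```shell
# Sample data from the question, written to temporary files.
dir=$(mktemp -d)
cat > "$dir/File1" <<'EOF'
1198464 ./aaa/file_A
   6345 ./bbb/file_B
  24345 ./bbb/file_C
2345212 ./ccc/file_D
  92315 ./ddd/file_E
 452217 ./ddd/file_F
EOF
cat > "$dir/File2" <<'EOF'
1198464 ./aaa/file_A
   1234 ./bbb/file_B
    340 ./bbb/file_C
 452217 ./ddd/file_F
EOF

# The array keys are paths only, so !($1 in lines) is true on every line
# and the condition reduces to ($2 in lines).
negated=$(awk 'FNR==NR {lines[$2]; next} ($2 in lines) && !($1 in lines)' "$dir/File2" "$dir/File1")

printf '%s\n' "$negated"
rm -rf "$dir"
```

This prints the file_A, file_B, file_C and file_F lines from File1, that is, every path present in both files whether or not the number differs. To filter on the number, it has to be stored as the array value and compared, as the other answers do.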

Upvotes: 0

Ed Morton

Reputation: 203491

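awk '
    # Strip the leading number so fname is the whole path, even if it contains spaces
    { fname=$0; sub(/[[:space:]]*[^[:space:]]+[[:space:]]+/,"",fname) }
    # First file (file2): remember the number for each path
    NR==FNR { map[fname] = $1; next }
    # Second file (file1): print lines whose path is known but whose number differs
    (fname in map) && (map[fname] != $1)
' file2 file1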
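
No explanation came with this answer. My reading is that building fname with sub() instead of using $2 keeps the whole path as the key, so paths containing spaces still match. A small sketch under that assumption, with made-up data (the ./my dir/... paths and the variable names by_field and by_path are hypothetical, not from the question):

```shell
dir=$(mktemp -d)
# Hypothetical sample data: every path contains spaces.
printf '%s\n' '  100 ./my dir/file A' '  200 ./my dir/file B' '  300 ./my dir/file C' > "$dir/file1"
printf '%s\n' '  100 ./my dir/file A' '  250 ./my dir/file B' > "$dir/file2"

# Keyed on $2: every path collapses to "./my", so unrelated lines collide.
by_field=$(awk 'FNR==NR {map[$2] = $1; next} $2 in map && map[$2] != $1' "$dir/file2" "$dir/file1")

# Keyed on the full path: strip the leading number, keep the rest of the line.
by_path=$(awk '
    { fname=$0; sub(/[[:space:]]*[^[:space:]]+[[:space:]]+/,"",fname) }
    NR==FNR { map[fname] = $1; next }
    (fname in map) && (map[fname] != $1)
' "$dir/file2" "$dir/file1")

printf 'keyed on $2:\n%s\nkeyed on full path:\n%s\n' "$by_field" "$by_path"
rm -rf "$dir"
```

With $2 as the key, all three file1 lines are printed; with the full path as the key, only the file B line is.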

Upvotes: 2

anubhava

Reputation: 785128

You may try this awk:

awk 'FNR==NR {map[$2] = $1; next} $2 in map && map[$2] != $1' file2 file1

   6345 ./bbb/file_B
  24345 ./bbb/file_C

To make it more readable:

awk 'FNR == NR {
   map[$2] = $1
   next
}
$2 in map && map[$2] != $1' file2 file1

Upvotes: 1
