Reputation: 1145
I have the following problem.
Say I have 2 files:
A.txt
1 A1
2 A2
B.txt
1 B1
2 B2
3 B3
I want a diff that is based only on the values in the first column, so the result should be
3 B3
How can this be solved with bash on Linux?
Upvotes: 0
Views: 1496
Reputation: 21965
awk is your friend
awk 'NR==FNR{f[$1];next}{if($1 in f){next}else{print}}' A.txt B.txt
or more simply
awk 'NR==FNR{f[$1];next}!($1 in f){print}' A.txt B.txt
or even more simply
awk 'NR==FNR{f[$1];next}!($1 in f)' A.txt B.txt
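To try this against the sample files from the question, here is a minimal sketch. It writes the one-liner across several lines with comments and renames the array from f to seen purely for readability; the scratch directory and the result variable are assumptions made for the example, not part of the original command.

```shell
# Work in a scratch directory so existing files are not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files from the question.
printf '1 A1\n2 A2\n' > A.txt
printf '1 B1\n2 B2\n3 B3\n' > B.txt

result=$(awk '
    # First file only (NR == FNR): remember the first column as a key.
    NR == FNR { seen[$1]; next }
    # Remaining files: print lines whose first column was not seen.
    !($1 in seen)
' A.txt B.txt)

printf '%s\n' "$result"
# 3 B3
```

The whole line from B.txt is printed, not just the key, which matches the output asked for in the question.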
A bit of explanation will certainly help.
NR and FNR are awk built-in variables. NR is the total number of records processed so far, including the current one; FNR is the number of records processed so far in the current file, including the current one. They are equal only while the first file is being read (provided that file is not empty).
f[$1] creates the array f on first use and adds $1 as a key if that key doesn't exist yet. No value is assigned, so f[$1] stays uninitialized (it acts as both an empty string and zero), but the value isn't used in your case.
next skips to the next record without processing the rest of the awk script.
The {if($1 in f){next}else{print}} part is therefore processed only for the second (and any subsequent) file.
$1 in f checks whether the key $1 exists in the array f.
The if-else-print part is self-explanatory.
In the last version, {print} is omitted because the default action in awk is to print the record.
Upvotes: 4
Reputation: 207688
Like this in bash, but only if you are really not interested in the second column at all:
diff <(cut -f1 -d" " A.txt) <(cut -f1 -d" " B.txt)
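To see what this prints for the sample files, here is a rough sketch. It writes the keys to temporary files instead of using process substitution, which should be equivalent but also runs under plain sh; the names A.keys, B.keys and key_diff are made up for the example.

```shell
# Work in a scratch directory so existing files are not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample files from the question.
printf '1 A1\n2 A2\n' > A.txt
printf '1 B1\n2 B2\n3 B3\n' > B.txt

# Keep only the first space-separated column of each file.
cut -f1 -d" " A.txt > A.keys
cut -f1 -d" " B.txt > B.keys

# diff exits with status 1 when the inputs differ, so "|| true" keeps
# the script going if it is run with "set -e".
key_diff=$(diff A.keys B.keys || true)

printf '%s\n' "$key_diff"
# 2a3
# > 3
```

The output is in diff's normal format and contains only the key 3, not the full line 3 B3, which is why this approach fits only when the second column does not matter.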
Upvotes: 0