Reputation: 5066
I have two files, smaller and bigger, and bigger contains all the lines of smaller. The matching lines are almost the same; only the last column differs.
file_smaller
A NM 0
B GT 4
file_bigger
A NM 5 <-same as in file_smaller according to my rules
C TY 2
D OP 6
B GT 3 <-same as in file_smaller according to my rules
I would like to print the lines of file_bigger whose first two columns do not appear in file_smaller, that is:
wished_output
C TY 2
D OP 6
Could you please help me to do so? Thanks a lot.
Upvotes: 1
Views: 258
Reputation: 247012
grep -vf <(cut -d " " -f 1-2 file_smaller| sed 's/^/^/') file_bigger
The process substitution results in this:
^A NM
^B GT
Then grep -v prints only the lines of file_bigger that match none of those patterns.
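One caveat with the patterns above: ^A NM would also match a line such as "A NMX 7", because nothing marks the end of the second column. A minimal sketch of a stricter variant follows, assuming the columns are separated by single spaces and the keys contain no regex metacharacters. The line "A NMX 7", the workdir directory, and the patterns file are hypothetical additions for illustration; a temporary file stands in for the process substitution so the sketch also runs in a plain POSIX shell.

```shell
# Sample data from the question, plus one hypothetical line ("A NMX 7")
workdir=$(mktemp -d)
printf '%s\n' 'A NM 0' 'B GT 4' > "$workdir/file_smaller"
printf '%s\n' 'A NM 5' 'C TY 2' 'D OP 6' 'B GT 3' 'A NMX 7' > "$workdir/file_bigger"

# One pattern per key: anchored at the start and closed by a trailing space,
# e.g. "^A NM " so that "A NMX ..." is not treated as the same key
cut -d ' ' -f 1-2 "$workdir/file_smaller" | sed 's/^/^/; s/$/ /' > "$workdir/patterns"

# Keep only the lines of file_bigger that match none of the patterns
result=$(grep -vf "$workdir/patterns" "$workdir/file_bigger")
printf '%s\n' "$result"

rm -r "$workdir"
```

With the sample data this keeps "C TY 2" and "D OP 6", and also the hypothetical "A NMX 7", which the unterminated pattern would have dropped.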
Upvotes: 1
Reputation: 6607
Bash 4 using associative arrays:
#!/usr/bin/env bash
f() {
if (( $# != 2 )); then
echo "usage: ${FUNCNAME} <smaller> <bigger>" >&2
return 1
fi
local -A smaller
local -a x
while read -ra x; do
smaller["${x[@]::2}"]=0
done <"$1"
while read -ra x; do
((${smaller["${x[@]::2}"]:-1})) && echo "${x[*]}"
done <"$2"
}
f /dev/fd/3 /dev/fd/0 <<"SMALLER" 3<&0 <<"BIGGER"
A NM 0
B GT 4
SMALLER
A NM 5
C TY 2
D OP 6
B GT 3
BIGGER
Upvotes: 0
Reputation: 16389
awk 'NR == FNR { seen[$1, $2]; next }   # first file: remember each key (columns 1 and 2)
     !(($1, $2) in seen)                # second file: print lines whose key was not seen
' file_smaller file_bigger
See if that meets your needs.
Upvotes: 1
Reputation: 2117
You can do the following:
cat file_bigger file_smaller |sed 's=\(.*\).$=\1='|sort| uniq -u > temp_pat
grep -f temp_pat file_bigger ; rm temp_pat
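This will, in order: concatenate both files, strip the last character from every line, sort the result, keep only the lines that occur exactly once, and save them as patterns in temp_pat. grep then prints the lines of file_bigger that match those patterns, and the temporary file is removed.
All in all, this gives the expected result.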
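The sed expression above removes only the final character of each line, so it relies on the last column being one character wide. A minimal sketch of the same sort | uniq -u idea that strips the whole last field instead is shown below. It assumes single-space separators, keys that are unique within each file, and keys free of regex metacharacters. The value 16 and the workdir directory are hypothetical, chosen to exercise a multi-character last column.

```shell
# Sample data from the question; "D OP 16" is a hypothetical multi-digit value
workdir=$(mktemp -d)
printf '%s\n' 'A NM 0' 'B GT 4' > "$workdir/file_smaller"
printf '%s\n' 'A NM 5' 'C TY 2' 'D OP 16' 'B GT 3' > "$workdir/file_bigger"

# Drop the last field, keep the keys that occur exactly once across both files,
# then turn each key into an anchored pattern such as "^C TY "
cat "$workdir/file_bigger" "$workdir/file_smaller" |
  sed 's/ [^ ]*$//' | sort | uniq -u |
  sed 's/^/^/; s/$/ /' > "$workdir/temp_pat"

# Print the lines of file_bigger whose key is unique, in their original order
unique_lines=$(grep -f "$workdir/temp_pat" "$workdir/file_bigger")
printf '%s\n' "$unique_lines"

rm -r "$workdir"
```

Because file_bigger contains every key of file_smaller, a key that occurs only once can only come from file_bigger, which is why uniq -u selects the wanted lines.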
Upvotes: 2