Reputation: 5059
I am not sure what I am doing wrong, but I am certainly making some mistake with my awk command.
I have two files, fileA contains names
FileA
Abhi
Roma
GiGi
KaKa
FileB contains other data with names
Abhi 23 Pk
DaDa 43 Gk
Roma 33 Kk
PkPk 22 Aa
Now, I am trying to print the details of all the names in FileB that are absent from FileA.
for i in `cat FileA` ; do cat FileB | awk '{ if ($1!='$i') print $0_}'>> Result; done
What I get is
Abhi 23 Pk
DaDa 43 Gk
Roma 33 Kk
PkPk 22 Aa
Abhi 23 Pk
DaDa 43 Gk
Roma 33 Kk
PkPk 22 Aa
Abhi 23 Pk
DaDa 43 Gk
Desired output
DaDa 43 Gk
PkPk 22 Aa
Could anyone help me find the error?
Thank you
Upvotes: 3
Views: 63138
Reputation: 2865
mawk 'NR==FNR ? __[$_] : $!_ in __==_' <( printf '%s' "$test1" ) <( printf '%s' "$test2" )
DaDa 43 Gk
PkPk 22 Aa
Or, without the ternary operator:
gawk '$!_ in __ != (FNR < NR || __[$_])'
DaDa 43 Gk
PkPk 22 Aa
Upvotes: 0
Reputation: 86
This task looks like the classic two-file processing pattern:
# prints lines of fileB whose first field does not appear in fileA
$ awk 'NR == FNR {a[$1]; next} !($1 in a)' fileA fileB
So here:
NR==FNR
- is true only while reading the 1st file.
a[$1]
- creates an element with the 1st column from fileA as key.
a[$0] is the same in this example, as $0==$1.
One could write ++a[$1] to count duplicates at the same time if needed,
or a[$1]=$2 to store some extra info.
next
- stops further processing of the current line while reading the 1st file, i.e. fileA.
!($1 in a)
- this part is only reached while reading fileB,
and it prints only the lines for which a[$1] does not exist,
i.e. there is no element with a key equal to $1.
Note, it is equivalent to !($1 in a) {print $0},
so the printing format could be modified if desired...
Upvotes: 1
Reputation: 670
The problem is that when you want to compare with a string, that string must be between quotes, otherwise, it assumes that the string is a variable name.
For example:
awk '{ if ($1!=name) print $0_}'
In this case, awk will assume that "name" is a variable, which will be empty, as no value has been assigned to it, and hence, compare $1 with an empty string.
awk '{ if ($1!="name") print $0_}'
In this case, awk will compare $1 with the string "name".
Therefore, the correct code for you is:
for i in `cat FileA` ; do cat FileB | awk -v var="$i" '{ if ($1!=var) print $0_}'>> Result; done
This will also work, though I think it is clearer in the previous way:
for i in `cat FileA` ; do cat FileB | awk '{ if ($1!="'$i'") print $0_}'>> Result; done
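The difference between the three comparisons can be checked on a single line of input. This is only a small sketch; the names line, unquoted, quoted and passed are placeholders chosen for illustration:

```shell
line='Abhi 23 Pk'

# Unquoted: name is an unset awk variable (empty string), so $1 != name is true.
unquoted=$(printf '%s\n' "$line" | awk '{ if ($1 != name) print "printed" }')

# Quoted: $1 is compared with the literal string "Abhi", so nothing is printed.
quoted=$(printf '%s\n' "$line" | awk '{ if ($1 != "Abhi") print "printed" }')

# -v copies the shell value into the awk variable var; nothing is printed either.
i=Abhi
passed=$(printf '%s\n' "$line" | awk -v var="$i" '{ if ($1 != var) print "printed" }')

printf 'unquoted=[%s] quoted=[%s] passed=[%s]\n' "$unquoted" "$quoted" "$passed"
```

Only the unquoted form prints the line, because it compares $1 with an empty string instead of with the name.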
EDIT: Check fedorqui's answer for a better approach to the solution.
Upvotes: 3
Reputation: 290025
For this you just need grep:
$ grep -vf fileA fileB
DaDa 43 Gk
PkPk 22 Aa
This uses fileA to obtain the patterns from. Then, -v inverts the match.
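One thing to keep in mind is that grep treats every line of fileA as a regular expression that may match anywhere in a line of fileB. The sketch below shows the effect with an extra, made-up record (Abhijit) and a stricter variant. It assumes a grep that supports -w (GNU and BSD grep do, though the option is not in POSIX), and the scratch directory and variable names are illustrative only:

```shell
tmp=$(mktemp -d)
printf '%s\n' Abhi Roma GiGi KaKa > "$tmp/fileA"
# Same data as the question plus one hypothetical name that merely starts with "Abhi".
printf '%s\n' 'Abhi 23 Pk' 'DaDa 43 Gk' 'Roma 33 Kk' 'PkPk 22 Aa' 'Abhijit 50 Zz' > "$tmp/fileB"

# Plain -vf: "Abhi" matches inside "Abhijit", so that line is dropped too.
plain=$(grep -vf "$tmp/fileA" "$tmp/fileB")

# -F takes the patterns as fixed strings, -w only accepts whole-word matches.
strict=$(grep -vwFf "$tmp/fileA" "$tmp/fileB")

printf 'plain:\n%s\nstrict:\n%s\n' "$plain" "$strict"
rm -rf "$tmp"
```

Even with -w the match is not limited to the first column, so when the name has to be compared against one field only, an awk comparison on $1 is the safer tool.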
AwkMan addresses very well why you are not matching lines properly. Now, let's see where your solution needs polishing:
Your code is:
for i in `cat FileA`
do
cat FileB | awk '{ if ($1!='$i') print $0_}'>> Result
done
"Why you don't read lines with "for"" explains well why this loop is fragile. So you would need something like what is described in "Read a file line by line assigning the value to a variable":
while IFS= read -r line
do
cat FileB | awk '{ if ($1!='$i') print $0_}'>> Result
done < fileA
Then, you are saying cat file | awk '...'. For this, awk '...' file is enough:
while IFS= read -r line
do
awk '{ if ($1!='$i') print $0_}' FileB >> Result
done < fileA
Also, the redirection could be done once, after the closing done, so you have a clearer command:
while IFS= read -r line
do
awk '{ if ($1!='$i') print $0_}' FileB
done < fileA >> Result
Calling awk so many times is not useful, and you can use the FNR==NR trick to process two files together.
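As a rough sketch of that FNR==NR trick on the question's sample data (the scratch directory and the variable name result are assumptions made for illustration):

```shell
tmp=$(mktemp -d)
printf '%s\n' Abhi Roma GiGi KaKa > "$tmp/fileA"
printf '%s\n' 'Abhi 23 Pk' 'DaDa 43 Gk' 'Roma 33 Kk' 'PkPk 22 Aa' > "$tmp/fileB"

# While reading the first file (FNR == NR), remember each name as an array key.
# For the second file, print the lines whose first field is not one of those keys.
result=$(awk 'FNR == NR { seen[$1]; next } !($1 in seen)' "$tmp/fileA" "$tmp/fileB")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Each file is read exactly once. One caveat: if fileA is empty, FNR==NR also holds for fileB, so nothing is printed.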
Let's now look at the awk part. Here you want to use some kind of variable to compare results. However, $i means nothing to awk.
Also, when you have a statement like:
awk '{if (condition) print $0}' file
It is the same as saying:
awk 'condition' file
Because {print $0} is the default action to perform when a condition evaluates to true.
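A quick way to convince yourself of this equivalence, sketched on the question's data with an arbitrary condition ($2 > 30 is just an example, as are the variable names):

```shell
data=$(printf '%s\n' 'Abhi 23 Pk' 'DaDa 43 Gk' 'Roma 33 Kk' 'PkPk 22 Aa')

# Explicit form: test inside the action block and print the whole line.
long_form=$(printf '%s\n' "$data" | awk '{ if ($2 > 30) print $0 }')

# Short form: a bare condition, relying on the default action { print $0 }.
short_form=$(printf '%s\n' "$data" | awk '$2 > 30')

printf '%s\n' "$short_form"
```

Both variables end up holding the same two lines (DaDa and Roma).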
Also, to let awk use a bash variable you need to use awk -v var="$shell_var" and then use var internally.
All together, you should say something like:
while IFS= read -r line
do
awk -v var="$line" '$1 != var' FileB
done < fileA >> Result
But since you are looping through the file many times, it will print the lines many, many times. That's why you have to go all the way up to the top of this answer and use grep -vf fileA fileB.
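To see the repetition concretely, here is a small sketch that counts what the loop prints on the question's data and compares it with the single grep call. The scratch directory and the variable names are illustrative assumptions:

```shell
tmp=$(mktemp -d)
printf '%s\n' Abhi Roma GiGi KaKa > "$tmp/fileA"
printf '%s\n' 'Abhi 23 Pk' 'DaDa 43 Gk' 'Roma 33 Kk' 'PkPk 22 Aa' > "$tmp/fileB"

# One awk run per name: each run prints every line whose name differs from that one name.
looped=$(while IFS= read -r line; do
    awk -v var="$line" '$1 != var' "$tmp/fileB"
done < "$tmp/fileA")

# Count the lines the loop produced: 3 + 3 + 4 + 4.
loop_count=$(printf '%s\n' "$looped" | awk 'END { print NR }')

# A single grep over both files yields only the wanted lines.
wanted=$(grep -vf "$tmp/fileA" "$tmp/fileB")

printf 'loop printed %s lines\n' "$loop_count"
printf '%s\n' "$wanted"
rm -rf "$tmp"
```

The loop filters against one name at a time, so a line is printed once for every name it differs from, instead of only when it differs from all of them.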
Upvotes: 10