Reputation: 15798
I want to do something like
if ($2 in another file) { print $0 }
So say I have file A.txt which contains
aa
bb
cc
I have B.txt like
00,aa
11,bb
00,dd
I want to print
00,aa
11,bb
How do I test that in awk? I am not familiar with the tricks of processing two files at a time.
Upvotes: 5
Views: 3685
Reputation: 483
This can be done by reading the first file and storing the required column in an array. Remember that awk arrays are associative: they store key -> value pairs.
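#!/bin/sh
INPUTFILE="source.txt"
DATAFILE="file1.txt"
# Pass the file names in with -v instead of splicing shell variables into the awk program.
awk -v input="$INPUTFILE" -v data="$DATAFILE" 'BEGIN {
# Load the first column of the input file as array keys.
# Testing for > 0 avoids an endless loop when the file cannot be read (getline returns -1).
while ((getline < input) > 0)
{
split($1,ar,",");
for (i in ar) dict[ar[i]]=""
}
close(input);
# Print every line of the data file whose third field is a known key.
while ((getline < data) > 0)
{
if ($3 in dict) {print $0}
}
close(data);
}'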
source.txt --
121 sekar osanan
321 djfsdn jiosj
423 sjvokvo sjnsvn
file1.txt --
sekar osanan 424
djfsdn jiosj 121
sjvokvo sjnsvn 321
snjsn vog 574
nvdfi aoff 934
sadaf jsdac 234
kcf dejwef 274
Output --
djfsdn jiosj 121
sjvokvo sjnsvn 321
It simply builds an array from the first column of source.txt and checks the third field of every line in file1.txt against that array. Likewise, any column or operation can be handled across multiple files.
Upvotes: 0
Reputation: 477
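Another way to do it is to call grep for each line of B.txt. The -x and -F options make grep match the whole line as a fixed string rather than a substring or a regular expression. This starts one grep process per input line, so it is slow on large files:
awk -F, -v file_name=A.txt '{if(system("grep -qxF -- " $2 OFS file_name) == 0){print $0}}' B.txt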
Upvotes: -1
Reputation: 67467
An alternative with join, if both files are sorted on the join field:
$ join -t, -1 1 -2 2 -o2.1,2.2 file1 file2
00,aa
11,bb
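This sets the delimiter to a comma, joins the first field of the first file with the second field of the second file, and prints the second file's fields in their original order (by default join would print the join field first). If the files are not sorted you need to sort them first, but then awk might be a better choice.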
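For unsorted input, the sort-then-join route could look roughly like the sketch below. The temporary directory, the file names and the sample data (the question's lines, shuffled) are assumptions made for illustration, and mktemp -d is assumed to be available. LC_ALL=C is set so that sort and join agree on the collation order.

```shell
#!/bin/sh
# Sketch only: the paths and sample data are illustrative assumptions.
LC_ALL=C
export LC_ALL

tmp=$(mktemp -d)
printf 'cc\naa\nbb\n' > "$tmp/A.txt"
printf '11,bb\n00,dd\n00,aa\n' > "$tmp/B.txt"

# join requires both inputs to be sorted on the join field.
sort "$tmp/A.txt" > "$tmp/A.sorted"
sort -t, -k2,2 "$tmp/B.txt" > "$tmp/B.sorted"

# Field 1 of A against field 2 of B; print B's two fields in their original order.
result=$(join -t, -1 1 -2 2 -o 2.1,2.2 "$tmp/A.sorted" "$tmp/B.sorted")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Note that the matches come out in sorted order of the join field, not in the original order of B.txt, which is one more reason awk may be the better fit here.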
Upvotes: 2
Reputation: 116700
There seem to be two schools of thought on the matter. Some prefer to use the BEGIN-based idiom, and others the FNR-based idiom.
Here's the essence of the former:
awk -v infile=INFILE '
BEGIN { while( (getline < infile)>0 ) { .... } }
... '
For the latter, just search for:
awk 'FNR==NR'
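To make the comparison concrete, here is a minimal sketch of both idioms applied to the A.txt and B.txt from the question. The temporary directory and the names infile, line, seen, begin_out and fnr_out are assumptions chosen for illustration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Sketch only: recreate the question's sample files in a scratch directory.
tmp=$(mktemp -d)
printf 'aa\nbb\ncc\n' > "$tmp/A.txt"
printf '00,aa\n11,bb\n00,dd\n' > "$tmp/B.txt"

# BEGIN-based idiom: load the lookup file with getline before the main input is read.
begin_out=$(awk -F, -v infile="$tmp/A.txt" '
BEGIN { while ((getline line < infile) > 0) seen[line]; close(infile) }
$2 in seen' "$tmp/B.txt")

# FNR-based idiom: pass both files as arguments and treat the first one as the lookup.
fnr_out=$(awk -F, 'FNR == NR { seen[$0]; next } $2 in seen' "$tmp/A.txt" "$tmp/B.txt")

printf 'BEGIN idiom:\n%s\n' "$begin_out"
printf 'FNR idiom:\n%s\n' "$fnr_out"

rm -rf "$tmp"
```

Both variants should print the same two lines for this data. The BEGIN form keeps the lookup file out of the argument list, while the FNR form is shorter and lets awk handle the reading.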
Upvotes: 1
Reputation: 74595
You could use something like this:
awk -F, 'NR == FNR { a[$0]; next } $2 in a' A.txt B.txt
This saves each line from A.txt as a key in the array a and then prints any lines from B.txt whose second field is in the array.
NR == FNR is the standard way to target the first file passed to awk, as NR (the total record number) is only equal to FNR (the record number for the current file) for the first file. next skips to the next record so the $2 in a part is never reached until the second file.
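A quick way to see why the test works is to print both counters for every record. The sketch below recreates the question's files in a scratch directory (the directory and the variable name trace are assumptions for illustration, and mktemp -d is assumed to be available):

```shell
#!/bin/sh
# Sketch only: show how NR and FNR evolve across two input files.
tmp=$(mktemp -d)
printf 'aa\nbb\ncc\n' > "$tmp/A.txt"
printf '00,aa\n11,bb\n00,dd\n' > "$tmp/B.txt"

# Run from inside the scratch directory so FILENAME stays short.
trace=$(cd "$tmp" && awk '{ print FILENAME ": NR=" NR " FNR=" FNR }' A.txt B.txt)
printf '%s\n' "$trace"
# A.txt: NR=1 FNR=1
# A.txt: NR=2 FNR=2
# A.txt: NR=3 FNR=3
# B.txt: NR=4 FNR=1
# B.txt: NR=5 FNR=2
# B.txt: NR=6 FNR=3

rm -rf "$tmp"
```

FNR restarts at 1 for B.txt while NR keeps counting, so the two only agree while A.txt is being read. One caveat: if the first file is empty, NR == FNR is also true for the second file, and its lines would be stored as keys instead of being tested.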
Upvotes: 7