Madza Farias-Virgens

Reputation: 1061

Print both records when one or another field falls in range defined in another file

I have file1

A 1 
A 7 
B 3  
B 7  

and file2

A 2 3 
A 6 8 
A 7 100 
B 1 3 
B 4 10 
B 700 800 

I am trying to print records from both files where, for a matching $1, either $2 or $3 in file2 lies between $2 and $2 + 5 from file1.

The expected output is

A 2 3 A 1
A 6 8 A 7
A 7 100 A 7
B 1 3 B 3
B 4 10 B 3
B 4 10 B 7

The code below works, but it only prints the records from one file

FNR==NR {
    n = ++q[$1]
    min[$1 FS n] = $2
    max[$1 FS n] = $2 + 5
    next
}

# process file2
n = q[$1] { # if no q entry, line cannot be in range
    for (i=1; i<=n; i++)
        if ( min[$1 FS i]<=$2 && $2<=max[$1 FS i] || min[$1 FS i]<=$3 && $3<=max[$1 FS i]) {
            print
            next
        }
}

If a field in file2 falls within several of the ranges derived from file1, the file2 record can be repeated

Upvotes: 1

Views: 47

Answers (1)

tshiono

Reputation: 22012

Would you please try the following:

awk '
# process file1
NR==FNR {
    q[++n] = $0
    f1[n] = $1
    f2[n] = $2
    next
}
# process file2
{
    for (i = 1; i <= n; i++) {
        if ($1 == f1[i] && ($2 >= f2[i] && $2 <= f2[i] + 5 ||
            $3 >= f2[i] && $3 <= f2[i] + 5))
            print $0, q[i]
    }
}' file1 file2

Output with the provided example files:

A 2 3 ... A 1 ...
A 6 8 ... A 1 ...
A 6 8 ... A 7  ...
A 7 100 ... A 7  ...
B 1 3 ... B 3  ...
B 4 10 ... B 3  ...
B 4 10 ... B 7  ...
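
If file1 is large, comparing every file2 line against every file1 record may become slow. A possible variant, sketched below, groups the file1 records by $1 so that each file2 line is only tested against ranges with the same key. The names range_join, in_range, cnt, rec and low are introduced here for illustration and do not come from the question, and the sample files are recreated without trailing spaces or extra columns.

```shell
#!/bin/sh
# range_join is a hypothetical wrapper name used only in this sketch.
# Usage: range_join FILE1 FILE2
range_join() {
    awk '
    # true when v lies in [start, start + 5]; v + 0 forces a numeric compare
    function in_range(v, start) {
        return (v + 0 >= start) && (v + 0 <= start + 5)
    }
    # first file: store each record and its range start, grouped by key
    NR == FNR {
        n = ++cnt[$1]
        rec[$1, n] = $0
        low[$1, n] = $2 + 0
        next
    }
    # second file: test $2 and $3 only against ranges with the same key
    {
        for (i = 1; i <= cnt[$1]; i++)
            if (in_range($2, low[$1, i]) || in_range($3, low[$1, i]))
                print $0, rec[$1, i]
    }' "$1" "$2"
}

# Recreate the sample inputs from the question in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'A 1' 'A 7' 'B 3' 'B 7' > file1
printf '%s\n' 'A 2 3' 'A 6 8' 'A 7 100' 'B 1 3' 'B 4 10' 'B 700 800' > file2

range_join file1 file2
```

With the sample data this should pair the same seven records as the script above. As in that script, a file2 record is printed once for every file1 range it falls into.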

Upvotes: 1
