Compare two files having different column numbers and print the requirement to a new file if condition satisfies

Question

I have two files with more than 10000 rows:

File1 has 1 col      File2 has 4 col     
23                   23 88 90 0
34                   43 74 58 5
43                   54 87 52 3
54                   73 52 35 4 
.                    .
.                    .

I want to look up each value from File1 in the first column of File2. If it exists, print that value along with the other three values from File2. For this example the output would be:

23 88 90 0
43 74 58 5
54 87 52 3

I have written the following script, but it takes too long to run.

s1=1; s2=$(wc -l < File1.txt)
while [ $s1 -le $s2 ]
do n=$(awk 'NR=="$s1" {print $1}' File1.txt)
   p1=1; p2=$(wc -l < File2.txt)
   while [ $p1 -le $p2 ]
   do awk '{if ($1==$n) printf ("%s %s %s %s\n", $1, $2, $3, $4);}'> ofile.txt
   (( p1++ ))
   done
(( s1++ ))
done

Is there a shorter or easier way to do this?

nu11p01n73R · Accepted Answer

You can do this concisely with awk:

awk 'FNR==NR{found[$1]++; next} $1 in found'

Test

>>> cat file1
23
34
43
54

>>> cat file2
23 88 90 0
43 74 58 5
54 87 52 3
73 52 35 4

>>> awk 'FNR==NR{found[$1]++; next} $1 in found' file1 file2
23 88 90 0
43 74 58 5
54 87 52 3
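
The question also asks for the result in a new file. A minimal sketch of that, assuming the sample data from the test above and reusing the output name ofile.txt from the question's script (the inputs are recreated here only so the snippet runs on its own):

```shell
# Recreate the sample inputs from the test above.
printf '%s\n' 23 34 43 54 > file1
printf '%s\n' '23 88 90 0' '43 74 58 5' '54 87 52 3' '73 52 35 4' > file2

# Same one-liner; '>' sends the matching lines to ofile.txt
# instead of the terminal.
awk 'FNR==NR{found[$1]++; next} $1 in found' file1 file2 > ofile.txt

cat ofile.txt
```

Each file is read only once, so this avoids re-scanning File2 for every row of File1 as the nested loops in the question do.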

What does it do?

- FNR==NR: FNR is the record (line) number within the current file, and NR is the total number of records read so far across all files. The two are equal only while awk reads the first file, file1, because FNR is reset to 1 each time awk starts a new file.
- {found[$1]++; next}: When the check is true, this adds $1, the first column of file1, as an index of the associative array found. next then skips the remaining rules and moves on to the next line.
- $1 in found: This check is reached only for the second file, file2. If the column 1 value, $1, is an index in the associative array found, awk prints the entire line. No action is written because printing the line is the default action.
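
To see why FNR==NR singles out the first file, it can help to print both counters for every line awk reads. A small sketch, assuming the same sample files as in the test (recreated here so it runs on its own); the variable name trace is only for illustration:

```shell
# Recreate the sample inputs from the test above.
printf '%s\n' 23 34 43 54 > file1
printf '%s\n' '23 88 90 0' '43 74 58 5' '54 87 52 3' '73 52 35 4' > file2

# For every input line, print the file name, the per-file counter (FNR)
# and the running total across all files (NR).
trace=$(awk '{print FILENAME, "FNR=" FNR, "NR=" NR}' file1 file2)
printf '%s\n' "$trace"
```

For the four lines of file1 the two counters match. On the first line of file2, FNR restarts at 1 while NR continues at 5, so FNR==NR is false from then on and only the $1 in found rule applies.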
