Dom

Reputation: 17

Comparing two files for duplicate strings using Mac bash

I am trying to go through two .txt files to find matches and print only the unique lines. I have a list of numbers and names, and a second list of only partial numbers.

The first file's format is like this:
5553239090,batman

The second file contains only the first six digits:
555323

I want to make sure that I am only removing lines that match on the first six digits, so a partial number like 239090 (which matches the middle of 5553239090) should not remove that line. So far, this is what I have found that works:

while read line
do
    echo $line
    while read line2
    do
        if [[ "${line:0:6}" != "$line2" ]]; then
            echo $line >> uniqueList2.txt
            echo ${line:0:6}
        fi
    done < file2.txt
done < file1.txt

For some reason, though, it is not removing all of the matches, only some of them. It seems to match the numbers at the top of file2 best, and the further down the list a number is, the more likely it is to be missed.

Is there something I missed or a better way to do this?

Upvotes: 1

Views: 154

Answers (1)

anubhava

Reputation: 785651

You can use awk like this to keep only the lines of file1 whose first six digits do not appear in file2:

awk 'FNR==NR {a[$1]; next} {for (i in a) if (index($0, i) == 1) next; print}' file2 file1
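
For example, with a couple of hypothetical extra lines added to the sample data from the question (the robin and joker entries and the second partial number are made up here), the command keeps only the line whose first six digits do not appear in file2:

cat > file1 <<'EOF'
5553239090,batman
5559876543,robin
5551112222,joker
EOF

cat > file2 <<'EOF'
555323
555111
EOF

# FNR==NR is true only while awk reads the first file argument (file2),
# so each six-digit number becomes a key of the array a.
# For every line of file1, next skips the line as soon as one stored
# number matches at position 1; otherwise the line is printed.
awk 'FNR==NR {a[$1]; next} {for (i in a) if (index($0, i) == 1) next; print}' file2 file1
# prints: 5559876543,robin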

Or you can use grep with a process substitution:

grep -vf <(sed 's/^/^/' file2) file1
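
The sed step prepends ^ to every six-digit number so grep treats it as an anchored pattern, and -v inverts the match so only the non-matching lines of file1 are printed; the anchoring is what guarantees a number like 239090 in the middle of a line cannot cause a match. With the same hypothetical sample files as above:

sed 's/^/^/' file2
# ^555323
# ^555111

grep -vf <(sed 's/^/^/' file2) file1
# prints: 5559876543,robin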

Upvotes: 1
