Reputation: 23
I have two files whose strings I am trying to compare line by line. File1 contains only 6-character prefixes, while File2 contains 12-character strings. How can I loop through File2 to find the strings that start with one of the 6-character prefixes from File1 and write those to a file?
File1:
002379
005964
File2:
002379ED6212
003354EB4591
004679BB2185
005964AB3379
005964DB5496
Upvotes: 2
Views: 374
Reputation: 183446
For a pure-Bash solution . . . assuming you're using Bash v4.x, you can first populate an associative array whose keys are the lines of File1:
declare -A prefixes
# IFS= and -r keep each line exactly as written (no trimming, no backslash processing).
while IFS= read -r prefix ; do
    prefixes[$prefix]=1
done < File1
# Now ${prefixes[002379]} is 1, and ${prefixes[005964]} is 1, but
# ${prefixes[anything-else]} is undefined.
And then check the first six characters of each line of File2 to see whether that prefix is a key in this associative array:
while IFS= read -r word ; do
    prefix="${word:0:6}"
    if [[ "${prefixes[$prefix]}" ]] ; then
        echo "$word"
    fi
done < File2
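Since the question also asks for the matches to end up in a file, here is a minimal end-to-end sketch of the two loops above using the sample data from the question. The temporary directory and the output name matches.txt are assumptions made for the demo, and the empty-line guards are an addition, not part of the answer as posted. It needs Bash 4 or later for declare -A.

```bash
#!/usr/bin/env bash
# Demo setup (assumed): recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/File1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 005964AB3379 005964DB5496 > "$workdir/File2"

# Step 1: store every prefix from File1 as a key of an associative array.
declare -A prefixes
while IFS= read -r prefix; do
    # Skip empty lines, since an empty subscript is an error in Bash.
    [[ -n $prefix ]] && prefixes[$prefix]=1
done < "$workdir/File1"

# Step 2: keep the lines of File2 whose first six characters are a stored key.
# Redirecting the whole loop sends all matches to one output file.
while IFS= read -r word; do
    key=${word:0:6}
    if [[ -n $key && -n ${prefixes[$key]} ]]; then
        printf '%s\n' "$word"
    fi
done < "$workdir/File2" > "$workdir/matches.txt"

cat "$workdir/matches.txt"
```

Placing the redirection after done opens the output file once for the whole loop rather than once per matching line.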
Upvotes: 2
Reputation: 212354
grep -f <(sed 's/^/^/' file1) file2
It would be nice to just use grep -f to find all the lines in file2 that match a regex in file1, but you want to anchor the regexes in file1 to the beginning of the line. So use the above to preprocess the strings by adding an anchor.
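To make the preprocessing step visible, the following sketch writes the anchored patterns to an intermediate file instead of using process substitution, then saves the matches. The scratch directory and the names patterns and matches.txt are assumptions for the demo. Each line of file1 is treated as a regular expression, which is harmless for plain hexadecimal prefixes like the ones in the question.

```bash
# Demo setup (assumed): recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/file1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 005964AB3379 005964DB5496 > "$workdir/file2"

# Prepend ^ to every prefix so each pattern can only match at the start of a line.
sed 's/^/^/' "$workdir/file1" > "$workdir/patterns"
cat "$workdir/patterns"

# Use the anchored patterns as the pattern file and save the matching lines.
grep -f "$workdir/patterns" "$workdir/file2" > "$workdir/matches.txt"
cat "$workdir/matches.txt"
```

Without the anchor, a prefix such as 002379 would also match a line that merely contains those six characters somewhere in the middle.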
Upvotes: 2
Reputation: 74685
This awk one-liner does what you want:
awk 'NR==FNR{a[$0];next}{for(i in a)if(substr($0,1,6)==i)print}' file1 file2
NR==FNR is only true for the first file. Each line of file1 is stored as a key in the array a. next skips the other block. For each record in the second file, loop through each of the keys in a and compare the first 6 characters. If they are the same, print the record.
Output:
002379ED6212
005964AB3379
005964DB5496
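For readers who find the one-liner dense, here is the same program spread over several lines with comments, run against the sample data and redirected to a file. The scratch directory and the output name matches.txt are assumptions for the demo.

```bash
# Demo setup (assumed): recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/file1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 005964AB3379 005964DB5496 > "$workdir/file2"

awk '
    # While reading the first file, the overall record number (NR) equals
    # the per-file record number (FNR). Store the line as a key and move on.
    NR == FNR { a[$0]; next }

    # Only records from the second file reach this block. Compare the first
    # six characters against every stored key and print on a match.
    {
        for (i in a)
            if (substr($0, 1, 6) == i)
                print
    }
' "$workdir/file1" "$workdir/file2" > "$workdir/matches.txt"

cat "$workdir/matches.txt"
```

Because the keys in a are unique, a record can match at most one of them, so no line is printed twice.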
Upvotes: 2
Reputation: 23364
awk might be able to achieve this:
awk 'NR == FNR {a[$0]; next};substr($0, 1, 6) in a' File1 File2
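A short sketch of how this variant can be used to write the result to a file. The second rule is a pattern with no action, so awk prints every record for which it is true, and the in operator tests whether the six-character prefix exists as a key in a without looping over the array. The variable n, the scratch directory and the output name matches.txt are assumptions introduced for the demo.

```bash
# Demo setup (assumed): recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/File1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 005964AB3379 005964DB5496 > "$workdir/File2"

# n is a hypothetical variable holding the prefix length (6 in the question).
awk -v n=6 '
    NR == FNR { a[$0]; next }   # first file: remember each prefix as a key
    substr($0, 1, n) in a       # second file: print records whose prefix is a key
' "$workdir/File1" "$workdir/File2" > "$workdir/matches.txt"

cat "$workdir/matches.txt"
```

Compared with looping over every key for each record, the membership test does a single array lookup per line of File2.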
Upvotes: 2