Xosen

Reputation: 13

Search for lines in a file that contain the lines of a second file

So I have a first file with an ID on each line, for example:

458-12-345
466-44-3-223
578-4-58-1
599-478
854-52658
955-12-32

Then I have a second file. It has an ID on each line followed by information, for example:

111-2457-1 0.2545 0.5484 0.6914 0.4222
112-4844-487 0.7475 0.4749 0.1114 0.8413
115-44-48-5 0.4464 0.8894 0.1140 0.1044

....

The first file only has 1000 lines, with the IDs of the info I need, while the second file has more than 200,000 lines.

I used the following bash command on Fedora with good results:

cat file1.txt | while read line; do cat file2.txt | egrep "^$line\ "; done > file3.txt

However I'm now trying to replicate the results in Ubuntu, and the output is a blank file. Is there a reason for this not to work in Ubuntu?

Thanks!

Upvotes: 1

Views: 134

Answers (3)

Hai Vu

Reputation: 40723

You can grep for several strings at once:

grep -f id_file data_file

This assumes that id_file contains all the IDs and data_file contains the IDs and their data.
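One caveat with `grep -f`: each ID is used as a pattern that may match anywhere in a line, so an ID can also hit a longer ID that contains it. A minimal sketch of a stricter variant follows, anchoring every ID to the start of the line and requiring a trailing space, as the command in the question does. The sample data and the name patterns.txt are made up for illustration, and stripping carriage returns only matters if the ID file has Windows-style line endings, which the thread does not confirm.

```shell
# Hypothetical sample data mirroring the layout described in the question.
printf '%s\n' '458-12-345' '599-478' > file1.txt
printf '%s\n' \
  '458-12-345 0.1 0.2' \
  '458-12-3456 0.3 0.4' \
  '1599-478 0.5 0.6' \
  '599-478 0.7 0.8' > file2.txt

# Build one anchored pattern per ID: "^ID " (start of line, ID, space).
# tr drops any carriage returns in case file1.txt has CRLF line endings.
tr -d '\r' < file1.txt | sed 's/^/^/; s/$/ /' > patterns.txt

# A single pass over file2.txt, keeping only lines whose first field is a listed ID.
grep -f patterns.txt file2.txt > file3.txt
```

With this sample data, a plain `grep -f file1.txt file2.txt` would also keep the `458-12-3456` and `1599-478` lines, while the anchored patterns keep only the two exact IDs.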

Upvotes: 2

ripat

Reputation: 3236

Typical job for awk:

awk 'FNR==NR{i[$1]=1;next} i[$1]{print}' file1 file2

This prints the lines of the second file whose first field appears as an ID in the first file. For even more speed, use mawk.
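For readers less familiar with the `FNR==NR` idiom, here is a commented sketch of the same one-liner. It additionally strips a trailing carriage return from each ID, on the assumption (not confirmed in the thread) that the ID file might have been saved with Windows-style line endings. The file names and sample data are hypothetical.

```shell
# Hypothetical sample data; the ID file deliberately uses CRLF line endings.
printf '458-12-345\r\n599-478\r\n' > ids_crlf.txt
printf '%s\n' \
  '458-12-345 0.1 0.2' \
  '458-12-3456 0.3 0.4' \
  '599-478 0.7 0.8' > data.txt

# FNR == NR is true only while awk reads the first file:
#   strip a trailing carriage return, remember the ID, skip to the next line.
# For the second file, print the line when its first field is a remembered ID.
awk 'FNR == NR { sub(/\r$/, ""); seen[$1] = 1; next }
     $1 in seen' ids_crlf.txt data.txt > matched.txt
```

Because the lookup is on the whole first field, `458-12-3456` is not matched by the ID `458-12-345`.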

Upvotes: 1

bpgergo

Reputation: 16037

This line works fine for me in Ubuntu:

cat 1.txt | while read line; do cat 2.txt | grep "$line"; done

However, this may be slow, as the second file (200,000 lines) will be grepped 1,000 times (once for each line in the first file).

Upvotes: 0
