Raghavan

Reputation: 401

Finding whether each string in file1 exists in another file

Below are the contents of both files:

File:1

257054
256986
257144

File:2

257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0
257057|Guestroom|http://398_b.jpg|350|350||http://398/hotels/2000000/1330000/1321300/1321278/1321278_398_t.jpg|0

I need a Bash command that compares the two files and outputs only the lines of File 2 whose first field appears in File 1:

257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0

I could iterate with an ordinary for loop, but that is very slow. I need a solution using awk or sed that processes the files quickly.

Upvotes: 0

Views: 149

Answers (3)

Inian

Reputation: 85780

You can do this with Awk in a single pass:

awk 'BEGIN{FS=OFS="|"}FNR==NR{file1[$0]; next}$1 in file1' file1 file2

While reading file1, each line is stored as a key of the array file1; while reading file2, only the lines whose first field ($1) is one of those keys are printed.
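
For anyone who wants the one-liner unpacked, here is a commented sketch that runs the same logic against the sample data from the question. The scratch directory, the array name ids and the out variable are only there for illustration; they are not part of the command above.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 257054 256986 257144 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0
257057|Guestroom|http://398_b.jpg|350|350||http://398/hotels/2000000/1330000/1321300/1321278/1321278_398_t.jpg|0
EOF

out=$(awk '
    BEGIN { FS = "|" }            # split every input line on the pipe
    FNR == NR { ids[$0]; next }   # first file: store each whole line as a key
    $1 in ids                     # second file: print lines whose field 1 is a key
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Only the 257054 line should be printed. One thing to keep in mind: the FNR == NR test assumes file1 is not empty, otherwise file2 is mistaken for the first file.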

Upvotes: 3

bishop

Reputation: 39434

If the contents of file1 can only appear in the first position of file2, you can use fgrep:

$ cat file1
257054
256986
257144
$ cat file2
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0
257057|Guestroom|http://398_b.jpg|350|350||http://398/hotels/2000000/1330000/1321300/1321278/1321278_398_t.jpg|0
$ fgrep -f file1 file2
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0

Note that fgrep can be replaced with grep -F. The -F form is the one specified by POSIX; fgrep is a legacy alias that recent GNU grep releases flag as obsolescent. Either way, each line of file1 is treated as a literal string rather than a regular expression. Plain grep -f without -F happens to give the same result for purely numeric IDs, but it would misbehave if file1 ever contained regex metacharacters such as . or *.

If the numbers from file1 could also occur elsewhere in file2, not just at the beginning of a line, you can make the match more explicit by combining grep with, e.g., sed:

grep -f <(sed 's/.*/^&|/g' file1) file2

This matches the numbers from file1 only when they appear at the beginning of a line followed by a pipe (|).
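
To see the difference between the two approaches, here is a small sketch. It assumes a hypothetical file1 that also contains 1321278, a number that occurs inside the URLs of file2 but never in the first column. The patterns are written to a temporary file instead of using process substitution, so the sketch also runs under plain sh.

```shell
tmp=$(mktemp -d)
# Hypothetical IDs: 1321278 appears only inside the URLs of file2.
printf '%s\n' 257054 1321278 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0
257057|Guestroom|http://398_b.jpg|350|350||http://398/hotels/2000000/1330000/1321300/1321278/1321278_398_t.jpg|0
EOF

# Unanchored fixed-string search: 1321278 matches inside both URLs.
loose=$(grep -c -F -f "$tmp/file1" "$tmp/file2")

# Anchored patterns of the form ^ID| only match the leading field.
sed 's/.*/^&|/' "$tmp/file1" > "$tmp/patterns"
strict=$(grep -c -f "$tmp/patterns" "$tmp/file2")

echo "loose=$loose strict=$strict"
rm -rf "$tmp"
```

With this data the unanchored search counts both lines, while the anchored one counts only the 257054 line.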

Upvotes: 3

James Brown

Reputation: 37424

You could also use join, provided both files are sorted on the join field (the first field by default):

$ join -t \| f1 f2
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0

man join educates us:

NAME
       join - join lines of two files on a common field

SYNOPSIS
       join [OPTION]... FILE1 FILE2

       -t CHAR
              use CHAR as input and output field separator
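
Since the file1 shown in the question is not sorted, a sorting step is probably needed first. A minimal sketch, assuming the sample data from the question; the sorted copies and the joined variable are made up for the illustration:

```shell
# Use one collation order for both sort and join.
LC_ALL=C
export LC_ALL

tmp=$(mktemp -d)
printf '%s\n' 257054 256986 257144 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
257054|Guestroom|http://397_b.jpg|350|350||http://397/hotels/2000000/1330000/1321300/1321278/1321278_397_t.jpg|0
257057|Guestroom|http://398_b.jpg|350|350||http://398/hotels/2000000/1330000/1321300/1321278/1321278_398_t.jpg|0
EOF

# join expects both inputs ordered on the join field (field 1 by default).
sort "$tmp/file1" > "$tmp/f1.sorted"
sort -t '|' -k1,1 "$tmp/file2" > "$tmp/f2.sorted"

joined=$(join -t '|' "$tmp/f1.sorted" "$tmp/f2.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Note that the output then follows the sorted order rather than the original order of file2.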

Upvotes: 2
