Jcmnia

Reputation: 23

How can I search file1.txt and file2.txt for matching names and print the output to a new file?

Problem: I need help with a task where I have two text files, file1.txt and file2.txt. The files have similar formats, but the names are on different line numbers, and they have different numbers of lines. The task is to check which names in file1.txt match the names in file2.txt, and then print the matching lines from file2.txt into a new file (file3.txt).

Example file formats:

file1.txt:

NAME:FLAT
Jerome:Flat 6
Jimmy:Flat 4

file2.txt:

0:NAME:JOB:MONEY:FLAT
1:Bob:Developer:$500:Flat 7
2:Jerome:Gardener:$50:Flat 6
3:Cindy:Graphics:$100:Flat 5
4:Jimmy:Mod:$150:Flat 4

What I want to achieve: I want to compare the names in file1.txt (e.g., Jerome, Jimmy) and check if they also exist in file2.txt.

I want to output only the matching lines from file2.txt. Any names in file2.txt that don’t appear in file1.txt should be ignored. For example, "Bob" and "Cindy" appear in file2.txt, but not in file1.txt, so they should be ignored. The matching lines (like "Jerome" and "Jimmy") from file2.txt should be copied into a new file (file3.txt).

Example of expected output: If Jerome and Jimmy from file1.txt match the lines in file2.txt, the output file (file3.txt) should look like this:

file3.txt:

2:Jerome:Gardener:$50:Flat 6
4:Jimmy:Mod:$150:Flat 4

What I have tried: Here is the code I have tried so far, which uses awk to do the matching:

awk -F ":" 'FNR==NR{a[$1];next}($1 in a){print}' file2.txt file1.txt > file3.txt

What I need help with: If anyone could help me figure out whether this is possible or offer a better solution, I’d really appreciate it!

Upvotes: 1

Views: 248

Answers (2)

Cyrus

Reputation: 88869

With some GNU tools (the <(...) process substitution requires a shell such as bash):

join -t ":" -1 1 -2 2 <(sed 1d file1.txt | sort) <(sort -t ":" -k 2,2 file2.txt) -o 2.1,2.2,2.3,2.4,2.5

Output:

2:Jerome:Gardener:$50:Flat 6
4:Jimmy:Mod:$150:Flat 4

See: info join and man sort
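
For shells without process substitution, the same join can be sketched with intermediate files. This is only an illustration under some assumptions: the scratch directory and the names names.sorted and file2.sorted are hypothetical, the sample data is copied from the question, and LC_ALL=C is set so that sort and join agree on ordering.

```shell
#!/bin/sh
# Recreate the question's sample data in a scratch directory (assumption:
# mktemp -d is available, as it is on common Linux and BSD systems).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'NAME:FLAT' 'Jerome:Flat 6' 'Jimmy:Flat 4' > file1.txt
printf '%s\n' '0:NAME:JOB:MONEY:FLAT' '1:Bob:Developer:$500:Flat 7' \
  '2:Jerome:Gardener:$50:Flat 6' '3:Cindy:Graphics:$100:Flat 5' \
  '4:Jimmy:Mod:$150:Flat 4' > file2.txt

# join expects both inputs sorted on the join field under the same locale.
export LC_ALL=C
sed 1d file1.txt | sort -t ':' -k 1,1 > names.sorted   # drop header, sort by name
sort -t ':' -k 2,2 file2.txt > file2.sorted            # sort by the name column

# Match field 1 of the name list against field 2 of file2 and
# print only the five fields that come from file2.
join -t ':' -1 1 -2 2 -o 2.1,2.2,2.3,2.4,2.5 names.sorted file2.sorted > file3.txt
cat file3.txt
```

Note that join emits lines in sorted name order, not in the original order of file2.txt. With the sample data the two orders happen to coincide.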

Upvotes: 1

RavinderSingh13

Reputation: 133710

With the samples you have shown, please try the following. Written and tested with GNU awk.

awk '
BEGIN  { FS=":" }
FNR==1 { next   }
FNR==NR{
  arr[$1]
  next
}
($2 in arr)
' file1.txt file2.txt

Explanation: a line-by-line explanation of the above.

awk '                    ##Start of the awk program.
BEGIN  { FS=":" }        ##BEGIN block: set the field separator FS to a colon.
FNR==1 { next   }        ##Skip the first (header) line of each input file.
FNR==NR{                 ##TRUE only while the first file, file1.txt, is being read.
  arr[$1]                ##Store the name ($1) as a key in the array arr.
  next                   ##Skip the remaining rules and move on to the next line.
}
($2 in arr)              ##For file2.txt: if the 2nd field is a key in arr, print the line.
' file1.txt file2.txt    ##The input file names, in the order they must be read.
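
To get the file3.txt asked for in the question, the output can be redirected. The sketch below recreates the question's sample data in a scratch directory (an assumption made so the example is self-contained) and uses the hypothetical array name names. It also shows why the command in the question prints nothing: that command reads file2.txt first and stores its line numbers (field 1), which never match the names in file1.txt.

```shell
#!/bin/sh
# Recreate the question's sample data in a scratch directory (assumption:
# mktemp -d is available, as it is on common Linux and BSD systems).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'NAME:FLAT' 'Jerome:Flat 6' 'Jimmy:Flat 4' > file1.txt
printf '%s\n' '0:NAME:JOB:MONEY:FLAT' '1:Bob:Developer:$500:Flat 7' \
  '2:Jerome:Gardener:$50:Flat 6' '3:Cindy:Graphics:$100:Flat 5' \
  '4:Jimmy:Mod:$150:Flat 4' > file2.txt

# file1.txt must come first so that its names are stored
# before file2.txt is scanned.
awk -F ':' '
FNR == 1  { next }              # skip the header line of each file
FNR == NR { names[$1]; next }   # first file: remember each name
$2 in names                     # second file: print lines whose name was stored
' file1.txt file2.txt > file3.txt
cat file3.txt
```

Unlike a sort-based approach, this keeps the matching lines in the order in which they appear in file2.txt and needs no sorting.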

Upvotes: 4
