user1258104

Reputation: 15

Merging two txt files by the list in first file and keeping the same list in output

I have two files: one contains a list of codes, and the other is pipe-delimited and pairs each code with a name.

Example file 1:

00001
00002
00001
00003
00002
00004

File 2 (note: some names contain spaces, such as "name1 1 1"; the example below has been updated to show this):

00001 | name1 1 1
00002 | name2 2
00003 | name3 3 3 3
00004 | name4 4 4 4 4

I need the output to keep the same structure as file 1, duplicates included, but with the names taken from file 2, like this:

Output file:

00001 | name1 1 1
00002 | name2 2
00001 | name1 1 1
00003 | name3 3 3 3
00002 | name2 2
00004 | name4 4 4 4 4

And so on. I have been using a Perl script I found and modified to find matches in the file by line from the first file:

    #!/usr/bin/perl -w
    use strict;
    #FindTextInFile.pl
    my ($names, $data) = ("codesonly.txt", "codeandtext.txt");
    open (FILE1, $names) || die;
    open (FILE2, $data) || die;
    undef $/; #Enter "file-slurp mode" by emptying variable indicating end-of-record
    my $string = <FILE2>; #Read entire file to be searched into a string variable
    $/ = "\n"; #Restore default value to end-of-record variable

    while (<FILE1>) {
        chomp; #remove new-line character from end of $_
        #Use quotemeta() to fix characters that could spoil syntax in search pattern
        my $qmname = quotemeta($_);


        if ($string =~m/$qmname/i) {
                    print " $_  \n";
        }
        else {

        }

    }

I have also tried the FINDSTR command at the Windows command prompt, but it does not give me output line by line. I am very new to Perl, so any help would be great, and if there is an easier way to do this I would like to hear it. The files I will be using are about 1M lines each, so I need something fast.

Thanks

Upvotes: 0

Views: 481

Answers (3)

user1258104

Reputation: 15

Thanks for everyone's replies. I was able to do what I needed with this code.

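    use strict;
    use warnings;

    open(my $file1, '<', 'file1.txt') or die "Can't open file1.txt: $!";
    open(my $file2, '<', 'file2.txt') or die "Can't open file2.txt: $!";

    # Build a lookup table from code to name
    my %file2Hash;
    while (my $line = <$file2>) {
        chomp $line;
        # Split at the first pipe and trim the spaces around key and value,
        # so that "00001 | name1 1 1" is stored under the key "00001"
        my ($key, $value) = $line =~ /^\s*(.+?)\s*\|\s*(.*?)\s*$/ or next;
        $file2Hash{$key} = $value;
    }

    # Walk file 1 in order and print each code with its name
    while (my $line = <$file1>) {
        chomp $line;
        if (exists $file2Hash{$line}) { print "$line | $file2Hash{$line}\n"; }
        else { print "$line | Error - Key not found in hash\n"; }
    }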

Upvotes: 0

Borodin

Reputation: 126772

Something like this perhaps?

    use strict;
    use warnings;

    # Build the code => name lookup line by line; names may contain spaces
    my %codes = do {
      open my $fh, '<', 'f2.txt' or die $!;
      map { /^\s*(\w+)\s*\|\s*(.*\S)/ } <$fh>;
    };

    open my $fh, '<', 'f1.txt' or die $!;
    while (<$fh>) {
      my ($key) = /(\w+)/ or next;   # skip blank lines
      print "$key | $codes{$key}\n";
    }

OUTPUT

    00001 | name1 1 1
    00002 | name2 2
    00001 | name1 1 1
    00003 | name3 3 3 3
    00002 | name2 2
    00004 | name4 4 4 4 4

Upvotes: 0

ikegami

Reputation: 386706

Use a hash for quick and easy lookups.

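    use strict;
    use warnings;

    # File names taken from the script in the question
    my $names_qfn = 'codeandtext.txt';   # "code | name" rows
    my $index_qfn = 'codesonly.txt';     # codes only, in output order

    my %rows;
    {
       open(my $names_fh, '<', $names_qfn)
          or die("Can't open \"$names_qfn\": $!\n");

       # Index each full row by its leading code
       while (<$names_fh>) {
          my ($id) = /^(\S+)/ or next;
          $rows{$id} = $_;
       }
    }

    {
       open(my $index_fh, '<', $index_qfn)
          or die("Can't open \"$index_qfn\": $!\n");

       # Print the stored row for each code, preserving the order of the list
       while (<$index_fh>) {
          chomp;
          print($rows{$_}) if exists($rows{$_});
       }
    }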
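
To try the hash approach without creating any files, here is a minimal self-contained sketch that runs on the sample data from the question through in-memory filehandles. The helper name `merge_by_codes`, the `%name_for` hash, and the `(no match)` placeholder are made up for illustration and are not part of the code above. It assumes each row in the names file has the form `code | name` and that the code itself contains no spaces or pipes.

```perl
use strict;
use warnings;

# merge_by_codes is a hypothetical helper written for this illustration.
# It takes two open filehandles and returns the merged lines as a list.
sub merge_by_codes {
    my ($names_fh, $codes_fh) = @_;

    # Pass 1: index every "code | name" row by its code.
    my %name_for;
    while (my $row = <$names_fh>) {
        chomp $row;
        my ($code, $name) = $row =~ /^\s*([^|\s]+)\s*\|\s*(.*?)\s*$/ or next;
        $name_for{$code} = $name;
    }

    # Pass 2: walk the code list in its original order, duplicates included.
    my @merged;
    while (my $code = <$codes_fh>) {
        chomp $code;
        next unless length $code;    # ignore empty lines
        push @merged, exists($name_for{$code})
            ? "$code | $name_for{$code}"
            : "$code | (no match)";  # placeholder text is an assumption
    }
    return @merged;
}

# Sample data from the question, read through in-memory filehandles.
my $names_text = "00001 | name1 1 1\n00002 | name2 2\n"
               . "00003 | name3 3 3 3\n00004 | name4 4 4 4 4\n";
my $codes_text = "00001\n00002\n00001\n00003\n00002\n00004\n";

open(my $names_fh, '<', \$names_text) or die "Can't open in-memory names: $!";
open(my $codes_fh, '<', \$codes_text) or die "Can't open in-memory codes: $!";

my @merged = merge_by_codes($names_fh, $codes_fh);
print "$_\n" for @merged;
```

Each file is read once and every lookup is a single hash access, so the work grows roughly linearly with the number of lines, unlike searching the whole second file for every line of the first. For real input, replace the in-memory handles with handles opened on the actual files.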

Upvotes: 2
