okapiho

Reputation: 89

Grep within Perl substitution (RHS)

I would like to replace text based on "string string" pairs defined in a key file, $key.

Sample Input file $input

a b c foo
d e f moo
g h i boo

Predefined "Key" file $key

cow moo
code foo
ghost boo
cheer woo

Desired Output

a b c code
d e f cow
g h i ghost

My Attempt

perl -pe 's/(.*?)(\woo)/$1qq{grep -oP ".*(?=\s$2)" $key}/e' $input > $output

error returned

syntax error at -e line 1, near "$1qq{grep -oP ".*(?=\s$2)" $key}"
syntax error at -e line 1, near "s/(.*?)(\woo)/$1qq{grep -oP ".*(?=\s$2)" $key}/ee"

Any help would be appreciated.

Suggestions for a better approach to achieve the desired result are very welcome, but an accepted answer would ideally also include a solution for, or a comment on, using Perl substitution.

Upvotes: 0

Views: 667

Answers (4)

ikegami

Reputation: 386416

$1qq{grep -oP ".*(?=\s$2)" $key}

is not a valid Perl expression. Maybe you meant

$1 . qq{grep -oP ".*(?=\s$2)" $key}

although there are numerous other errors in that expression. (You used qq{} where you should have used qx{}, you forgot to escape the \, you used $key without having assigned a value to it, maybe more.)

Maintainable solution that only reads the key file once:

perl -e'
   my %lookup;
   open(my $fh, "<", shift(@ARGV))
      or die $!;

   while (<$fh>) {
      my ($v,$k) = split;
      $lookup{$k} = $v;
   }

   while (<>) {
      my @f = split;

      next if !@f;  # Skip blank lines.

      if (!defined($lookup{$f[3]})) {
         warn("Can'\''t find key \"$f[3]\". Copying record unchanged.\n");
         print;
         next;
      }

      $f[3] = $lookup{$f[3]};
      print("@f\n");
   }
' keyfile.txt input.txt >output.txt

Upvotes: 3

mpapec

Reputation: 50667

Using perl from command line,

perl -lane'
  BEGIN{ local @ARGV = pop; %h = reverse map split, <> }
  print join " ", @F[0..2], $h{$F[3]};

' input key

output

a b c code
d e f cow
g h i ghost

Update, allowing the replacement text in the key file to contain spaces:

perl -lane'
  BEGIN{ local @ARGV = pop; %h = reverse map /(.+)\s+(\S+)$/, <> }
  print join " ", @F[0..2], $h{$F[3]};

' input key

Upvotes: 4

Jotne

Reputation: 41460

Here is how you can use awk

awk 'FNR==NR {a[$2]=$1;next} $NF=a[$NF]' key input
a b c code
d e f cow
g h i ghost

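It reads the key file into array a, indexed by the second field with the first field as the value.
It then prints the input file, using the last field as the index into a to replace that field.

The assignment doubles as the print condition, so a line is dropped when a[$NF] is 0 or empty (for example, when the key is missing). To print every line, use: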

awk 'FNR==NR {a[$2]=$1;next} {$NF=a[$NF];print}' key input

Upvotes: 3

Sobrique

Reputation: 53498

Personally, I don't like one-liners because they're hard to read.

The general trick for pattern replacement is this:

my %replacements;
open ( my $keyfile, "<", "key_file.txt" ) or die $!;
while ( <$keyfile> ) {
     chomp;
     my ( $value, $key ) = split;
     $replacements{$key} = $value; 
}

my $regex = join ( "|", map { quotemeta($_) } keys %replacements ); 
$regex = qr/$regex/; 

open ( my $replace_fh, "<", "input_file" ) or die $!; 
while ( <$replace_fh> ) {
    s/\b($regex)\b/$replacements{$1}/g;
    print;
}

This turns the key file into a hash of replacements, builds a regular expression that matches any of its keys, and then uses that regex in a substitution, with $1 as the lookup key for the hash.

Upvotes: 2
