user1491587

Reputation: 85

replace text based on a dictionary

I need to do something similar to this post (but with a twist). That is why I am asking.

unix shell: replace by dictionary

I have a dictionary (dict.txt). It is space-separated and reads like this:

V7 Momentum

B6 Quanta

....

(The first column is the key and the second column is the value, in a sense.)

I have a user file (user.txt) that contains occurrences of the keys (V7, B6, etc.). The twist is that the keys are not in their own column, so the method in the above post does not apply.

The user file (user.txt) can be viewed as a stream of characters. I just want to replace every occurrence of a key (e.g., V7) with the value (Momentum) looked up from the dictionary, regardless of whether the key is bounded by spaces or by other characters.

For example:

"We have V7 as input" --> should change to --> "We have Momentum as input"

"We have somethingV7_as input" --> should change to --> "We have somethingMomentum_as input"

Upvotes: 4

Views: 7746

Answers (3)

slitvinov

Reputation: 5768

Usage: awk -f foo.awk dict.dat user.dat
http://www.gnu.org/software/gawk/manual/html_node/String-Functions.html
http://www.gnu.org/software/gawk/manual/html_node/Arrays.html

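# First file (the dictionary): map each key to its replacement.
NR == FNR {
  rep[$1] = $2
  next
}

# Remaining files: apply every replacement to the current line.
# Note: gsub() treats key as a regular expression, not a literal string.
{
  for (key in rep)
    gsub(key, rep[key])
  print
}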

Upvotes: 11

potong

Reputation: 58430

This might work for you (GNU sed):

sed '/./!d;s/\([^ ]*\) *\(.*\)/\\|\1|s||\2|g/' dict.txt | sed -f - user.txt
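
The one-liner is dense, so here is a rough Python sketch of what the first sed invocation appears to do: it drops empty lines and turns each dictionary line into a sed command, which the second sed then runs against user.txt. The function name `dict_line_to_sed_command` is made up for illustration and is not part of the answer.

```python
import re

def dict_line_to_sed_command(line):
    """Mirror the first sed: turn 'KEY VALUE' into a sed substitution command."""
    if line == "":
        # Corresponds to /./!d, which deletes lines containing no characters.
        return None
    # Corresponds to s/\([^ ]*\) *\(.*\)/\\|\1|s||\2|g/
    # count=1 because sed's s command (without the g flag) replaces only
    # the first match on the line.
    return re.sub(r"([^ ]*) *(.*)", r"\\|\1|s||\2|g", line, count=1)

print(dict_line_to_sed_command("V7 Momentum"))
```

For the line `V7 Momentum` this yields `\|V7|s||Momentum|g`. In sed, `\|V7|` is an address selecting lines that match `V7`, and the empty pattern in `s||Momentum|g` reuses that last regular expression. Because keys are used as regular expressions and `|` is the delimiter, a key or value containing `|` would break the generated command.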

Upvotes: 3

Borodin

Reputation: 126722

As long as your dictionary keys contain nothing but alphanumeric characters, this Perl will do what you need.

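use strict;
use warnings;

# Load the dictionary: the first field is the key, the rest of the line is the value.
open my $fh, '<', 'dict.txt' or die $!;
my %dict = map { chomp; split ' ', $_, 2 } <$fh>;
# Build one alternation that matches any key.
my $re = join '|', keys %dict;

# Replace every key occurrence in the user file in a single pass per line.
open $fh, '<', 'user.txt' or die $!;
while (<$fh>) {
  s/($re)/$dict{$1}/g;
  print;
}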
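
The alphanumeric restriction matters because the keys are interpolated into the regex unescaped. As a rough sketch of how the same single-pass idea could cope with other characters, here is a Python version that escapes each key and tries longer keys first. This is not the answer's code: `load_dict` and `build_replacer` are hypothetical helper names, and the inline sample lines stand in for dict.txt.

```python
import re

def load_dict(lines):
    """Parse 'key value' lines into a mapping, skipping blank lines."""
    mapping = {}
    for line in lines:
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            mapping[parts[0]] = parts[1].strip()
    return mapping

def build_replacer(mapping):
    """Return a function that replaces every key occurrence in one pass."""
    if not mapping:
        return lambda text: text
    # Longest keys first, so a key such as "V70" wins over its prefix "V7".
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes each key match literally, whatever characters it holds.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)

# Sample data standing in for dict.txt (assumption for illustration).
mapping = load_dict(["V7 Momentum\n", "\n", "B6 Quanta\n"])
replace = build_replacer(mapping)
print(replace("We have somethingV7_as input"))
```

Because all keys are matched in a single pass, a replacement value that happens to contain another key is not rewritten again, which can happen when substitutions are applied one key at a time.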

Upvotes: 3
