Reputation: 37
I have two files: a mapping file and an input file.
cat map.txt
test:replace
cat input.txt
The word test should be replaced.But the word testbook should not be replaced just because it has "_test" in it.
I am using the command below to find each word from the mapping file in the input file and replace it with its mapped value.
awk 'FNR==NR{ array[$1]=$2; next } { for (i in array) gsub(i, array[i]) }1' FS=":" map.txt FS=" " input.txt
It searches the input file for every word listed in map.txt and replaces it with the word that follows the ":" in the mapping. In the example above, "test" is replaced with "replace".
Current result:
The word replace should be replaced.But the word replacebook should not be replaced just because it has "_replace" in it.
Expected Result:
The word replace should be replaced.But the word testbook should not be replaced just because it has "_test" in it.
So what I need is this: the word should be replaced only when it appears on its own. If it is attached to any other character, it should be left alone.
Any help is appreciated.
Thanks in advance.
Upvotes: 0
Views: 196
Reputation: 37454
Loop over all the words with for and replace where needed:
$ awk '
NR==FNR {                  # hash the map file
    a[$1]=$2
    next
}
{
    for(i=1;i<=NF;i++)     # loop over every word (field)
        if($i in a)        # if the whole word is a key in the map...
            $i=a[$i]       # ...replace it
}1
' FS=":" map FS=" " input
The word replace should be replaced.But the word testbook should not be replaced just because it has "_test" in it.
Edit: Using match to extract words from strings to preserve punctuation:
$ cat input2
Replace would Yoda test.
$ awk '
NR==FNR {                                  # hash the map file
    a[$1]=$2
    next
}
{
    for(i=1;i<=NF;i++) {
        # a stricter test such as $i ~ /^[a-zA-Z]+[,.!?]?$/ could go here
        # to weed out fields that are not a word plus optional punctuation
        if(match($i,/^[a-zA-Z]+/)) {       # leading run of letters in the field
            w=substr($i,RSTART,RLENGTH)    # extract the word
            if(w in a)                     # is the word in the map?
                sub(w,a[w],$i)             # replace it, keeping the punctuation
        }
    }
}1' FS=":" map FS=" " input2
Replace would Yoda replace.
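The match approach above only looks at the leading word of each space-separated field. A variation that walks each line one run of word characters at a time and looks every run up in the map might look like the sketch below. It assumes that a "word" means letters, digits and underscore (my assumption, based on the "_test" example in the question), and the result shell variable is only there for illustration. The line is rebuilt by string concatenation rather than sub, so punctuation and spacing are kept as they are.

```shell
# Recreate the sample files from the question.
printf '%s\n' 'test:replace' > map.txt
printf '%s\n' 'The word test should be replaced.But the word testbook should not be replaced just because it has "_test" in it.' > input.txt

result=$(awk '
NR==FNR {                          # first file: load the map
    split($0, kv, ":")
    map[kv[1]] = kv[2]
    next
}
{
    out = ""
    rest = $0
    # peel off one run of word characters at a time
    while (match(rest, /[A-Za-z0-9_]+/)) {
        word = substr(rest, RSTART, RLENGTH)
        prefix = substr(rest, 1, RSTART - 1)   # text before the word, kept verbatim
        if (word in map)                       # only whole runs are looked up
            word = map[word]
        out = out prefix word
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest                 # append whatever follows the last word
}
' map.txt input.txt)

printf '%s\n' "$result"
```

With the sample files this should print the expected line from the question: testbook and "_test" are left alone because neither run is exactly test.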
Upvotes: 1
Reputation: 204258
With GNU awk for word boundaries:
awk -F':' '
NR==FNR { map[$1] = $2; next }
{
    for (old in map) {
        new = map[old]
        gsub("\\<"old"\\>",new)
    }
    print
}
' map input
The above will fail if old contains regexp metacharacters or escape characters, or if new contains &, but as long as both use only word-constituent characters it'll be fine.
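If the map might contain such characters, one way around the caveat is to avoid regexps altogether and treat old and new as literal strings. The sketch below swaps gsub for index and substr and checks the neighbouring characters by hand. It is not the same as the word-boundary operators: here a match counts as a whole word when the characters on either side are not letters, digits or underscore, which is my own working definition. The helper names is_word_char and replace_literal, the file names map2.txt and input2.txt, and the result variable are all made up for illustration.

```shell
# Hypothetical sample data: one key with a regexp metacharacter, one value with &.
printf '%s\n' 'a.b:R&D' 'test:replace' > map2.txt
printf '%s\n' 'test a.b axb testbook "_test"' > input2.txt

result=$(awk -F':' '
function is_word_char(c) { return c ~ /^[A-Za-z0-9_]$/ }

# Replace whole-word occurrences of the literal string old in s with new.
function replace_literal(s, old, new,    out, start, p, before, after) {
    out = ""
    start = 1
    while ((p = index(substr(s, start), old)) > 0) {
        p += start - 1                             # position in the full string
        before = (p > 1) ? substr(s, p - 1, 1) : ""
        after  = substr(s, p + length(old), 1)
        if (!is_word_char(before) && !is_word_char(after)) {
            out = out substr(s, start, p - start) new
            start = p + length(old)
        } else {                                   # part of a longer word: skip one char
            out = out substr(s, start, p - start + 1)
            start = p + 1
        }
    }
    return out substr(s, start)
}

NR==FNR { if ($1 != "") map[$1] = $2; next }       # ignore empty keys
{
    line = $0
    for (old in map)
        line = replace_literal(line, old, map[old])
    print line
}
' map2.txt input2.txt)

printf '%s\n' "$result"
```

With this data, a.b is replaced but axb is not, and the & in R&D comes through literally. As in the gsub version, a replacement value that happens to equal another key can be replaced again, depending on the order in which the loop visits the map.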
Upvotes: 1