Reputation: 15
I’m facing a problem
1) I have a list_file intended to be used for in-place replacement, like this:
Replacement pattern ; Matching patterns
EXTRACT ___________________
toto ; tutu | tata | tonton | titi
bobo ; bibi | baba | bubu | bebe
etc. 14000 lines !!!
_____________________________
2) I have a target file in which I want to replace those patterns
EXTRACT INPUT _______________
hello my name is bob and I am a Titi and I like bubu
_____________________________
I want it to become
EXTRACT OUTPUT ______________
hello my name is bob and I am a toto and I like bobo
_____________________________
For example, with one replacement:
echo 'toto; tutu | tata | tonton | titi ' | awk '{gsub(/ tutu | tata | tonton | titi /," toto ")}1'
gives
toto; toto | toto | toto | toto
With the full list, I tried:
awk -F';' 'NR==FNR{A[$1]=$2; next} IGNORECASE = 1 {for(i in A) gsub(/A[i]/,i)}1'
I expected this to give the output above, but it does not.
Sadly, awk doesn't seem to understand the pipe character "|" as an OR indicator. I have also tried to achieve this with sed, but that runs very slowly even though it works :(
Does anyone have a better idea? Thanks, M
Upvotes: 0
Views: 729
Reputation: 203169
By putting the array reference inside regexp delimiters you're turning A[i] into literal characters in the regexp instead of an array that contains a regexp indexed by a string. Just don't do that. Also, your placement of the IGNORECASE assignment makes no sense: it is a GNU awk variable that only needs to be set once, in a BEGIN section, not in the condition part of a rule. Try this:
awk -F';' 'BEGIN{IGNORECASE = 1} NR==FNR{A[$1]=$2; next} {for(i in A) gsub(A[i],i)}1'
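To make the difference between the two gsub() calls concrete, here is a minimal sketch. The input text 'Ai tutu' and the one-element array are made up for illustration; nothing here depends on IGNORECASE, so it should behave the same in any POSIX awk.

```shell
#!/bin/sh
# Regexp literal: /A[i]/ means the letter "A" followed by the bracket
# expression [i], so it matches the text "Ai" and never looks at the array.
literal=$(echo 'Ai tutu' | awk 'BEGIN { A["toto"] = "tutu" }
    { for (i in A) gsub(/A[i]/, i) } 1')

# Dynamic regexp: the string stored in A[i] ("tutu") is used as the regexp.
dynamic=$(echo 'Ai tutu' | awk 'BEGIN { A["toto"] = "tutu" }
    { for (i in A) gsub(A[i], i) } 1')

echo "$literal"   # toto tutu
echo "$dynamic"   # Ai toto
```

In the first call the text "Ai" is replaced and "tutu" is left alone; in the second call it is the other way round, which is the behavior the question was after.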
I'm not saying this approach, which loops over every regexp in the list for each input line, is a good idea, but it might give you the output you're looking for. Stop using the word "pattern", by the way: patterns are for quilts and sweaters. In text matching and replacing, use either "regexp" or "string", whichever one you mean in every context. You'll find it much easier to write and understand code if you understand where regexps vs strings occur.
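If the entries in the list are really plain strings rather than regexps, one possible alternative is an exact, case-insensitive string lookup instead of 14000 gsub() calls per line. The following is only a sketch under assumptions that are not stated in the question: every matching entry is a single whitespace-delimited word, words in the target file carry no attached punctuation, and collapsing runs of whitespace in the output is acceptable. The file names and the replace_words function are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample files; their contents mirror the extracts in the question.
tmp=$(mktemp -d)
printf '%s\n' 'toto ; tutu | tata | tonton | titi ' \
              'bobo ; bibi | baba | bubu | bebe' > "$tmp/list_file"
printf '%s\n' 'hello my name is bob and I am a Titi and I like bubu' > "$tmp/target_file"

# replace_words LIST TARGET (hypothetical helper)
replace_words() {
    awk '
        NR == FNR {                      # first file: build the lookup table
            gsub(/^ +| +$/, "")          # trim the line; awk re-splits the fields
            for (k = 2; k <= NF; k++)    # $2..$NF are the strings to match
                map[tolower($k)] = $1    # $1 is the replacement string
            next
        }
        {                                # second file: replace word by word
            for (k = 1; k <= NF; k++)
                if (tolower($k) in map)
                    $k = map[tolower($k)]
            print
        }
    ' FS=' *[;|] *' "$1" FS=' ' "$2"     # a different field separator per file
}

result=$(replace_words "$tmp/list_file" "$tmp/target_file")
echo "$result"   # hello my name is bob and I am a toto and I like bobo
```

Each input word costs one array lookup, so the size of the list should matter much less than with a loop over regexps. The trade-off is that this matches whole words as strings only: anything that needs real regexp matching, multi-word entries, or words followed by punctuation would need a different approach.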
Upvotes: 1