Reputation: 63
First, apologies for the potentially duplicate question. I'm new to bash scripting and can't even work out which keywords to search for. With that said, I've tried to simplify the problem description as much as I can:
I have a text file (test.txt) that contains only this line:
REPLACE
I ran the following command, which is supposed to replace the file's text (i.e. REPLACE) with the value of the code variable, "if (A & B)".
code="if (A & B)" ; awk -v var="${code}" '{ gsub(/REPLACE/, var); print }' test.txt
Expected output: I expect the value of the code variable to be printed as is:
if (A & B)
Actual output: somehow the ampersand is expanded into 'REPLACE', which is the regexp argument to gsub:
if (A REPLACE B)
Perhaps I need to escape the ampersand, but unfortunately the code variable is populated outside my control, so I can't edit its value manually.
FYI awk version is "GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)"
Thanks!
Upvotes: 6
Views: 2017
Reputation: 943
I had the same problem today. With the help of the above responses, I wrote the following awk function:
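# Replace the first match of regexp rex in $0 with the literal string val.
# Returns the match position, or 0 if there was no match.
function sanesub(rex, val,    p) {   # p is a local variable
    p = match($0, rex)
    if (p != 0) {
        # val is concatenated, not used as a gsub replacement, so "&" stays literal
        $0 = substr($0, 1, p-1) val substr($0, p+RLENGTH)
    }
    return p
}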
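A minimal usage sketch of the function above, assuming the question's setup (a file containing the single line REPLACE, and the value "if (A & B)"). The temporary file and the variable name out are only for illustration. Note that, unlike gsub, this replaces only the first match on each line.

```shell
# Illustrative setup: a temporary file holding the question's single line.
tmp=$(mktemp)
printf 'REPLACE\n' > "$tmp"

code="if (A & B)"
out=$(awk -v var="$code" '
function sanesub(rex, val,    p) {
    p = match($0, rex)
    if (p != 0) {
        $0 = substr($0, 1, p-1) val substr($0, p+RLENGTH)
    }
    return p
}
{ sanesub("REPLACE", var); print }
' "$tmp")
rm -f "$tmp"

printf '%s\n' "$out"   # prints: if (A & B)
```

Because val is spliced into $0 by concatenation and never passed to gsub as a replacement, the ampersand is not interpreted.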
Upvotes: 0
Reputation: 11
You can double-escape the '&' character, so your code would be:
code="if (A \\\& B)" ; awk -v var="${code}" '{ gsub(/REPLACE/, var); print }' test.txt
Output:
# code="if (A \\\& B)" ; awk -v var="${code}" '{ gsub(/REPLACE/, var); print }' test.txt
if (A & B)
#
Note that in the above example you need to escape both the '\' and the '&' characters, which is why it's '\\\&'.
If you don't want to manipulate your input strings manually as in the above example, you could use an additional 'gsub' in your awk code to preprocess the input string, adding the escape characters before running your main 'gsub', as follows:
code="if (A & B)" ; awk -v var="${code}" '{ gsub("&","\\\\&", var); gsub(/REPLACE/, var); print }' test.txt
Output:
# code="if (A & B)" ; awk -v var="${code}" '{ gsub("&","\\\\&", var); gsub(/REPLACE/, var); print }' test.txt
if (A & B)
#
Note the need for 4 '\' characters in the preprocessing gsub.
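To see why the escaping matters, here is a small sketch contrasting an unescaped and an escaped '&' in a replacement written directly in the awk program (no -v is involved, so only awk's own string-literal escaping applies). The variable names plain and escaped are just for illustration.

```shell
# Unescaped "&" in the replacement stands for the matched text.
plain=$(printf 'REPLACE\n' | awk '{ gsub(/REPLACE/, "[&]"); print }')

# "\\&" in an awk string literal reaches gsub as \&, which yields a literal "&".
escaped=$(printf 'REPLACE\n' | awk '{ gsub(/REPLACE/, "[\\&]"); print }')

printf '%s\n%s\n' "$plain" "$escaped"
# prints:
# [REPLACE]
# [&]
```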
Upvotes: 1
Reputation: 203109
& is a backreference metacharacter in many tools, and it means "the string that matched the regexp you searched for". If you're trying to use literal strings then use literal strings instead of regexps and backreferences.
e.g.:
code="if (A & B)"
awk -v old="REPLACE" -v new="$code" 's=index($0,old){$0=substr($0,1,s-1) new substr($0,s+length(old))} 1' test.txt
The alternative, trying to sanitize regexps and replacements, is complicated, error-prone, and generally not for the faint of heart; see: Is it possible to escape regex metacharacters reliably with sed
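One caveat with -v is that awk interprets backslash escape sequences in the assigned value, so a value containing a backslash may be altered before the script sees it. A hedged sketch of one way around that, assuming the value can be passed through the environment: read it from ENVIRON, which is not escape-processed, and loop with index() so that every occurrence is replaced literally. The environment variable names old and new, the temporary file, and the sample value are assumptions for illustration.

```shell
tmp=$(mktemp)
printf 'REPLACE and REPLACE\n' > "$tmp"

code='if (A & B) \n'   # contains both "&" and a backslash sequence

out=$(old='REPLACE' new="$code" awk '
BEGIN { old = ENVIRON["old"]; new = ENVIRON["new"] }
{
    head = ""
    tail = $0
    # Move the text before each match, plus the literal replacement, into head.
    while ((s = index(tail, old)) > 0) {
        head = head substr(tail, 1, s-1) new
        tail = substr(tail, s + length(old))
    }
    print head tail
}
' "$tmp")
rm -f "$tmp"

printf '%s\n' "$out"   # prints: if (A & B) \n and if (A & B) \n
```

Scanning only the remaining tail after each replacement means the loop terminates even when the new text itself contains the old string.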
Upvotes: 8