The Tcl regsub command seems to behave strangely when I give it strings that include escaped characters.
I have used autoexpect to capture a series of screen displays from an app whose testing I want to automate. Rather than use its output as a single block, I am trying to split the generated script into a series of character strings to improve maintainability. I've used vi to create a series of fragments, which I then read in one at a time and use as match patterns with expect. I do have to do some substitution (for example, "^[" becomes "ESC"), but I've got as far as fragment 5, so the idea is generally working. Unfortunately, I'm beaten by the substitution of "\[" with "[" in the pattern "xxxx\[[xxxx" (the x's are other characters).
I've written a Tcl ascii string dump procedure, and I'm using that here.
% ascii_string_dump "\\\[" 0 8 pattern
*** ASCII dump of: pattern ( 2 characters) ***
---------------------------------------------------------------------
| 0000 | \ [ ... ... ... ... ... ... | 5c 5b .. .. .. .. .. .. |
| 0008 | ... ... ... ... ... ... ... ... | .. .. .. .. .. .. .. .. |
---------------------------------------------------------------------
% ascii_string_dump "a\\\[\[z" 0 8 test
*** ASCII dump of: test ( 5 characters) ***
---------------------------------------------------------------------
| 0000 | a \ [ [ z ... ... ... | 61 5c 5b 5b 7a .. .. .. |
| 0008 | ... ... ... ... ... ... ... ... | .. .. .. .. .. .. .. .. |
---------------------------------------------------------------------
%
% regsub -all "\\\[" "a\\\[\[z" "Z" newstring
2
% ascii_string_dump $newstring 0 8 newstring
*** ASCII dump of: newstring ( 5 characters) ***
---------------------------------------------------------------------
| 0000 | a \ Z Z z ... ... ... | 61 5c 5a 5a 7a .. .. .. |
| 0008 | ... ... ... ... ... ... ... ... | .. .. .. .. .. .. .. .. |
---------------------------------------------------------------------
%
In the above series, I first check that I can create the 2-character pattern "\[". I then create a pattern which is an abbreviated version of my real problem string, "a\[[z". Then I submit the regexp and test string to regsub, hoping to replace the "\[" characters with a single "Z". As you can see, two substitutions have occurred (rather than one) and there is an unexpected "\" at character 2!
Any enlightenment very welcome. I've spent a lot of time on this (including writing the ASCII dump proc!) but I'm getting nowhere...
Best wishes, Allan
This is how regular expressions generally work in most languages.
Written with braces (Tcl's equivalent of raw strings, since braces suppress Tcl's own backslash substitution), your regsub command is equivalent to:
regsub -all {\[} {a\[[z} "Z" newstring
In regular expressions, \[ represents the literal character [ (the \ escapes the metacharacter [, which would otherwise begin a bracket expression). That is why both [ characters in your test string were replaced and the backslash was left in place.
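One way to see the two layers of processing is to inspect what the regex engine actually receives once Tcl has parsed the quoted word. The following is only a small sketch; the variable names (pattern, subject, result, count) are made up for illustration.

```tcl
# Tcl's parser handles the double-quoted word first:
# "\\\[" becomes the two characters \ and [
set pattern "\\\["
puts [string length $pattern]   ;# 2
puts $pattern                   ;# \[

# The regex engine then reads \[ as "a literal [",
# so every [ is replaced and the backslash is left alone.
set subject {a\[[z}
set count [regsub -all $pattern $subject Z result]
puts $count    ;# 2
puts $result   ;# a\ZZz
```

This reproduces the count of 2 and the a\ZZz result shown in the question.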
If you want to replace the two-character string \[, you need to match a backslash followed by an opening square bracket, written in a regular expression as \\ and \[ respectively, so the braced regsub becomes:
regsub -all {\\\[} {a\[[z} "Z" newstring
puts $newstring
# aZ[z
If you want to use double quotes, you need a lot more escaping: each character of \\\[ must itself be escaped for Tcl, so you add a backslash in front of every one of them:
regsub -all "\\\\\\\[" "a\\\[\[z" "Z" newstring
puts $newstring
# aZ[z
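As a sanity check, the braced and quoted spellings can be compared directly, since both should reach regsub as the same four-character pattern. This is a minimal sketch; the variable names (braced, quoted, out) are chosen here only for illustration.

```tcl
# Braces pass the text through untouched; quotes need every
# backslash and the bracket escaped once more for Tcl's parser.
set braced {\\\[}
set quoted "\\\\\\\["

puts [expr {$braced eq $quoted}]   ;# 1 (identical strings)
puts [string length $quoted]       ;# 4

# Either spelling replaces only the backslash-bracket pair.
regsub -all $quoted {a\[[z} Z out
puts $out                          ;# aZ[z
```

Because braces avoid the second layer of escaping, they are usually the easier form to read and maintain for regex patterns.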
Or, if you can use string map:
string map {{\[} {Z}} {a\[[z}
or
string map {"\\\[" {Z}} "a\\\[\[z"
should do the job.
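Since string map accepts any number of literal pairs, several of the fragment substitutions mentioned in the question could probably be done in one pass with no regex escaping at all. The sketch below makes two assumptions: that "^[" in a fragment is the two characters ^ and [ standing for the ESC character, and that the fragment text shown is representative. Both the fragment and the variable names are hypothetical.

```tcl
# Hypothetical fragment: ^[ stands for ESC, \[ stands for a literal [
set fragment {^[\[2J}

# Each pair is matched literally, scanning left to right.
set cleaned [string map [list {^[} \x1b {\[} {[}] $fragment]

puts [string length $cleaned]   ;# 4 (ESC, [, 2, J)
```

Building the map with list and braced elements keeps each search string exactly as typed, so only the ESC replacement relies on backslash substitution.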