user2189731

Reputation: 558

Tcl regsub used with subst produces unexpected result

Edit: I am trying to replace "xor_in0" with "xor_in[0]" and "xor_in1" with "xor_in[1]" in a given string, str. Each name to be replaced ("xor_in0", "xor_in1") is passed in as a parameter, which I call "key"; its replacement ("xor_in[0]", "xor_in[1]") is the "value", which is stored in an array. The point is to replace every occurrence of "key" in "str" with "value". Here is my test code:

set str "(xor_in0^xor_in1)"
set str1 "xor_in0^xor_in1" ;# another input
set key "xor_in0"
set value "xor_in\[0\]"
set newstr ""
set nonalpha  "\[^0-9a-zA-Z\]"
regsub -all [subst {^\[(*\]($key)($nonalpha+)}] $str [subst -nobackslashes {$value\2}] newstr
puts $newstr

But it doesn't work. I also tried removing the [subst ...] calls, and it still failed to match anything. This contradicts my understanding of regular expressions. Please help.

Upvotes: 0

Views: 497

Answers (3)

Peter Lewerin

Reputation: 13252

Simple is generally better:

regsub -all {\d+} $s {[&]} s

Takes care of your examples.
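
As a quick sketch, here is that command applied to the two inputs from the question (the loop and the `out` list are only there for illustration). Note that it brackets every run of digits, wherever it appears, which is fine for these inputs but is an assumption about your data:

```tcl
# Apply the substitution to both sample strings from the question.
set out {}
foreach s {{(xor_in0^xor_in1)} {xor_in0^xor_in1}} {
    # & in the replacement stands for the matched digits.
    regsub -all {\d+} $s {[&]} s
    lappend out $s
    puts $s
}
```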

Upvotes: 0

user2189731

Reputation: 558

Actually, the real problem in my question was the A-z typo :)

Upvotes: 0

Donal Fellows

Reputation: 137567

Everything seems a bit over-complicated to me.

Let's look at the regsub that you're actually going to execute. There's a trick to doing that easily; if your command is:

regsub -all [subst {^\[(*\]($key)($nonalpha+)}] $str [subst -nobackslashes {$value\2}] newstr

Then we can print out what it's going to try to do with:

puts [list regsub -all [subst {^\[(*\]($key)($nonalpha+)}] $str [subst -nobackslashes {$value\2}] newstr]

That reveals that you're really doing this:

regsub -all {^[(*](xor_in0)([^0-9a-zA-z]+)} (xor_in0^xor_in1) {xor_in[0]\2} newstr

The part that looks a bit strange in there is the ([^0-9a-zA-z]+) at the end of the RE. It's legal but odd, as we can write it more simply with \W, which matches a non-word character:

regsub -all {^[(*](xor_in0)(\W+)} $str {xor_in[0]\2} newstr

And that seems to work. What might the bug be then? The definition of nonalpha: you're using "\[^0-9a-zA-z\]" instead of "\[^0-9a-zA-Z\]". Yes, a literal ^ lies in the ASCII (and Unicode) range from A to z, so the negated class refuses to match it.
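
To see the effect of the range directly, here is a minimal sketch that reuses the string from the question; the variable names (inWideRange, inLetterRange, broken, fixed) are made up for the demonstration:

```tcl
# In ASCII, "Z" is 0x5A and "a" is 0x61, so the range A-z also covers
# the punctuation that sits between them, including ^ (0x5E) and _ (0x5F).
set inWideRange   [regexp {[A-z]} "^"]     ;# ^ falls inside A-z
set inLetterRange [regexp {[A-Za-z]} "^"]  ;# ^ is not a letter

set str "(xor_in0^xor_in1)"
# Typo version: the negated class excludes ^, so nothing matches and
# regsub returns the input unchanged.
set broken [regsub -all {^[(*](xor_in0)([^0-9a-zA-z]+)} $str {xor_in[0]\2}]
# Corrected version: ^ now counts as a "non-alpha" character, so it matches.
set fixed  [regsub -all {^[(*](xor_in0)([^0-9a-zA-Z]+)} $str {xor_in[0]\2}]

puts "A-z matches ^: $inWideRange, A-Za-z matches ^: $inLetterRange"
puts "with A-z: $broken"
puts "with A-Z: $fixed"
```

Even with the corrected class, the leading ( is swallowed, because [(*] matches it outside any capture group and the replacement never puts it back.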


OTOH, I'd actually expect a transformation to really be done like this:

set newstr [regsub -all {(\y[a-zA-Z]+_in)(\d+)} $str {\1[\2]}]

The only things you're not used to there are \y (a word boundary constraint) and \d (match any digit). Or, for a simple transformation (mapping all instances of a literal substring to another literal substring):

set newstr [string map [list $key $value] $str]
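
Here is a rough sketch of both approaches side by side on the two inputs from the question. The array name pinMap and the results list are hypothetical, standing in for whatever array you actually keep your key/value pairs in:

```tcl
# Hypothetical lookup table, standing in for the array from the question.
array set pinMap {
    xor_in0 {xor_in[0]}
    xor_in1 {xor_in[1]}
}

set results {}
foreach str {{(xor_in0^xor_in1)} {xor_in0^xor_in1}} {
    # Pattern-based: bracket the digits that follow any letters_in prefix.
    set viaRegsub [regsub -all {(\y[a-zA-Z]+_in)(\d+)} $str {\1[\2]}]
    # Literal: array get returns the key/value list that string map expects.
    set viaMap [string map [array get pinMap] $str]
    lappend results $viaRegsub $viaMap
    puts "$str -> $viaRegsub | $viaMap"
}
```

Both give the same result here. They differ once names overlap: string map replaces literal substrings wherever they occur, so a key of xor_in1 would also rewrite the start of xor_in10, while the \d+ in the regsub version takes the whole number.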

Upvotes: 2
