Reputation: 149
I have simple strings like the one below:
set x "\ \ a\ b\ \ a\ b\ b\ a\ \ \ "
I am trying to extract all occurrences of "a" and "b" by using the following regexp:
set match [regexp -all -inline {(\S+)} $x]
But that gives me:
a a b b a a b b b b a a
I was expecting:
a b a b b a
What am I doing wrong?
Thanks.
Upvotes: 0
Views: 819
Reputation: 137567
The -all -inline option combination makes regexp return a list of all the matches and capturing submatches that it finds, and your regular expression includes a capturing submatch that happens to be the same as the entire match, so every word is reported twice.
Try this:
set match [regexp -all -inline {\S+} $x]
If you need non-capturing parentheses, use (?:…) instead of (…).
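As a small sketch of the difference, the snippet below runs the same input through a capturing and a non-capturing group. The variable names withCapture and withoutCapture are only illustrative.

```tcl
set x "\ \ a\ b\ \ a\ b\ b\ a\ \ \ "

# Capturing group: each whole match is followed by its submatch.
set withCapture [regexp -all -inline {(\S+)} $x]

# Non-capturing group: only the whole matches are returned.
set withoutCapture [regexp -all -inline {(?:\S+)} $x]

puts $withCapture     ;# a a b b a a b b b b a a
puts $withoutCapture  ;# a b a b b a
```

The non-capturing form still lets you group for alternation or repetition without adding extra elements to the result list.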
If you have to have capturing groups because you're matching something more complex, you can filter the result with lmap (8.6 or later) or foreach:
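# Tcl 8.6 or later: keep the whole match, drop the submatch
set match [lmap {matched ignored} [regexp -all -inline {(\S+)} $x] {
    set matched
}]

# Any Tcl version: same filtering with foreach
set match {}
foreach {matched ignored} [regexp -all -inline {(\S+)} $x] {
    lappend match $matched
}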
Note that we're using two iteration variables here and one list, so we pick off elements by twos. Using three iteration variables would pick them off by threes, etc. (The lmap command is just like foreach except it produces a list of the values obtained by evaluating its body script, whereas foreach throws those body script results away.)
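To illustrate the three-variable case, here is a minimal sketch with two capturing groups. The input string and the names text, pairs, whole, key, and value are made up for the example and are not from the question.

```tcl
# Hypothetical input, chosen only for illustration.
set text "width=10 height=20"

# Two capturing groups produce three list elements per match:
# the whole match, then the first and second submatches.
set pairs {}
foreach {whole key value} [regexp -all -inline {(\w+)=(\w+)} $text] {
    lappend pairs $key $value
}

puts $pairs  ;# width 10 height 20
```

The number of iteration variables should be one more than the number of capturing groups in the pattern, since the whole match always comes first.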
Upvotes: 2