Reputation: 79
How do I pattern match this in TCL?
set game1 base_ball_ABC_10_100_a_b_c
set game2 base_ball_CDE_20_200_d_e_f
set game3 base_bat_DEF_40_400_j_k_l
The regexp below does not seem to work, since it matches everything up to the first digit (i.e. up to _10, _20):
if { ![regexp {^([^0-9]+)(.*)$} $k3 - bb digit] } {
I wonder what the regexp would be to match base_ball and ABC_10_100_a_b_c? The above regexp seems to match up to a digit. I would like it to match [base_ball or base_bat] and then [ABC_10_100, CDE_20_200].
Upvotes: 0
Views: 155
Reputation: 71538
You don't need to use regexp if the strings have an equal number of underscores and what you need is always the part before and the part after the 2nd underscore:
set k3 {base_ball_ABC_10_100_a_b_c}
set parts [split $k3 "_"]
set part1 [join [lrange $parts 0 1] "_"]
# base_ball
set part2 [join [lrange $parts 2 end] "_"]
# ABC_10_100_a_b_c
Taking your previous question into consideration as well, you may not need to join either, if you're only counting the unique values, so doing
set k3 {base_ball_ABC_10_100_a_b_c}
set parts [split $k3 "_"]
set part1 [lrange $parts 0 1]
set part2 [lrange $parts 2 end]
should probably be enough.
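As a rough sketch of the split-based approach applied to all three names from the question, the steps can be wrapped in a small proc. The proc name splitGame is hypothetical (it is not part of the answer above), and lassign assumes Tcl 8.5 or later:

```tcl
# Hypothetical helper: return a two-element list holding the first two
# underscore-separated fields (rejoined) and everything after them (rejoined).
proc splitGame {name} {
    set parts  [split $name "_"]
    set prefix [join [lrange $parts 0 1] "_"]
    set rest   [join [lrange $parts 2 end] "_"]
    return [list $prefix $rest]
}

# The three sample names from the question.
set games {
    base_ball_ABC_10_100_a_b_c
    base_ball_CDE_20_200_d_e_f
    base_bat_DEF_40_400_j_k_l
}

foreach game $games {
    lassign [splitGame $game] prefix rest
    puts "$prefix | $rest"
}
```

This only works as long as the prefix always consists of exactly two fields; a name with a different number of leading fields would need a different rule.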
If you still want to use regexp, then I'd advise reading re_syntax, and the expression really depends on a lot of things. There can be a lot of different expressions that work with your data, but the best expression can only be crafted by knowing about the data, and edge cases. From what I can assume, I'd guess something like this may work:
regexp {^([^_]+_[^_]+)_(.+)$} $k3 - bb digit
Where [^_]+ means that any character will be matched, except for _s, so that the above matches:
^ - beginning of string
[^_]+ - any non _s
_ - one _
[^_]+ - any non _s
_ - one _
.+ - any characters
$ - end of string
Upvotes: 2
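To see the difference between the two patterns, here is a minimal sketch that runs the pattern from the question and the underscore-based pattern on the same string. The variable names qPrefix, qRest, aPrefix and aRest are made up for this illustration:

```tcl
set k3 base_ball_ABC_10_100_a_b_c

# Pattern from the question: [^0-9]+ consumes every non-digit character,
# so the first group runs right up to the first digit.
regexp {^([^0-9]+)(.*)$} $k3 - qPrefix qRest
puts "$qPrefix | $qRest"    ;# base_ball_ABC_ | 10_100_a_b_c

# Underscore-based pattern: two runs of non-underscore characters joined by
# one underscore, then an underscore, then the rest of the string.
regexp {^([^_]+_[^_]+)_(.+)$} $k3 - aPrefix aRest
puts "$aPrefix | $aRest"    ;# base_ball | ABC_10_100_a_b_c
```

The same underscore-based pattern gives base_bat and DEF_40_400_j_k_l for the third name, since it counts underscores rather than looking for digits.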