sarbjit

Reputation: 3894

Regex pattern behaving differently in TCL compared with Perl & Python

I am trying to extract a substring from a string using regular expressions. The Python code below works and gives the desired result:

Python Solution

>>> import re
>>> x = r'CAR_2_ABC_547_d'
>>> spattern = re.compile("CAR_.*?_(.*)")
>>> spattern.search(x).group(1)
'ABC_547_d'

Perl Solution

$ echo "CAR_2_ABC_547_d" | perl -pe's/CAR_.*?_(.*)/$1/'
ABC_547_d

TCL Solution

However, when I use the same pattern in Tcl, I get a different result: the capture group comes back empty. Can someone please explain this behavior?

% regexp -inline "CAR_.*?_(.*)" "CAR_2_ABC_547_d"
CAR_2_ {}

Upvotes: 2

Views: 533

Answers (2)

glenn jackman

Reputation: 247042

Another approach, instead of capturing the text that follows the prefix, is to just remove the prefix:

% set result [regsub {^CAR_.*?_} "CAR_2_ABC_547_d" {}]
ABC_547_d
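The same prefix-stripping idea carries over to Python, which may help when comparing the two languages side by side. This is a minimal sketch using `re.sub`; the variable names are illustrative only.

```python
import re

x = "CAR_2_ABC_547_d"

# Remove the "CAR_<anything>_" prefix; count=1 limits it to one replacement
result = re.sub(r"^CAR_.*?_", "", x, count=1)
print(result)
```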

Upvotes: 1

Dinesh

Reputation: 16438

A branch has the same preference as the first quantified atom in it which has a preference.

So if you have .* as the first quantifier, the whole RE will be greedy, and if you have .*? as the first quantifier, the whole RE will be non-greedy.

Since your pattern uses .*? as its first quantifier, the rest of the expression, including (.*), is matched non-greedily as well, so the capture group matches the empty string.

If you anchor the pattern with the end-of-line $, the match is forced to extend to the end of the string, so the group captures the rest:

% regexp -inline "CAR_.*?_(.*)$" "CAR_2_ABC_547_d"
CAR_2_ABC_547_d ABC_547_d
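For comparison, the effect can be reproduced in Python, where each quantifier keeps its own greediness. The following is only a rough sketch: it assumes that, for this particular pattern, Tcl's all-non-greedy matching gives the same result as writing every quantifier lazily in Python. The variable names are illustrative.

```python
import re

x = "CAR_2_ABC_547_d"

# Python honors greediness per quantifier: lazy .*? followed by greedy (.*)
mixed = re.search(r"CAR_.*?_(.*)", x)

# Every quantifier lazy: an approximation of how Tcl treats the original pattern
all_lazy = re.search(r"CAR_.*?_(.*?)", x)

# Anchoring with $ forces even the lazy group to reach the end of the string
anchored = re.search(r"CAR_.*?_(.*?)$", x)

print(repr(mixed.group(1)))
print(repr(all_lazy.group(0)), repr(all_lazy.group(1)))
print(repr(anchored.group(1)))
```

In this sketch, the all-lazy pattern stops as soon as it can, which leaves the group empty, while the anchored variant has to consume the remainder of the string to reach `$`.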

Reference: re_syntax

Upvotes: 4
