Reputation: 165
I'm trying to use ActiveState Tcl on a Windows PC to run the following Tcl script. It looks like the non-greedy match in \#\s*\((?:.*?)\) is behaving greedily and matching into the following statements. Any idea what I'm doing wrong or how to fix this?
proc extract_verilog_instances {text} {
    set rexp {(\w+)\s+(\#\s*\((?:.*?)\)\s*)?(\w+(?:\[\d+\])?)\s*\(}
    # rexp will match any of the following statement types:
    #
    #   module_name instance_name (
    #   module_name instance_name[0] (
    #   module_name #(parameter1, parameter2) instance_name (
    #   module_name #(parameter1, parameter2) instance_name[0] (
    set regrun [regexp -inline -all -indices -expanded $rexp $text]
    foreach {m0 m1 m2 m3} $regrun {
        set start_index [lindex $m0 0]
        set end_index [lindex $m0 1]
        set module [string range $text [lindex $m1 0] [lindex $m1 1]]
        set instance [string range $text [lindex $m3 0] [lindex $m3 1]]
        puts "module:$module instance:$instance"
    }
}
set vlog {
second_module #(2) inst2 (.in2(sig2), .out2(sig3));
third_module inst3 (.in3(sig3), .out3(sig4));
fourth_module #(.in4_clk_freq(50), .in4_rst_val(1'b0)) inst4 (.in4_clk(clk), .in4_rst(rst), .in4_in1(sig4), .in4_in2(sig5), .out4(sig6));
}
extract_verilog_instances $vlog
Expected output:
module:second_module instance:inst2
module:third_module instance:inst3
module:fourth_module instance:inst4
Actual output:
module:second_module instance:inst4
Upvotes: 2
Views: 59
Reputation: 627341
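You can use the following pattern (the # stays escaped as \# because your code passes -expanded, where an unescaped # starts a comment):
(\w+?)\s+(\#\s*\(.*\)\s*)?(\w+(?:\[\d+\])?)\s*\(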
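In a Tcl regex, greediness is not decided quantifier by quantifier: a pattern without top-level alternation takes its preference from the first quantifier in it. So, if you use \w+? as the first quantified subpattern, all subsequent patterns with + or * will effectively behave like +? and *?. Conversely, your original pattern starts with the greedy \w+, which is why the later .*? still matched greedily.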
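To see the effect in isolation, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not taken from the question. The two patterns differ only in the first quantifier, yet the later .*? matches differently:

```tcl
# Hypothetical sample input, chosen only to show the difference.
set s "m #(1) a (x); n #(2) b (y);"

# First quantifier is greedy (\w+): the whole pattern prefers the longest
# match, so the later .*? runs on to the last closing parenthesis.
regexp {\w+\s+#\(.*?\)} $s greedy

# First quantifier is lazy (\w+?): the whole pattern prefers the shortest
# match, so .*? stops at the first closing parenthesis.
regexp {\w+?\s+#\(.*?\)} $s lazy

puts "greedy: $greedy"
puts "lazy:   $lazy"
```

Going by the matching rules in Tcl's re_syntax manual page, the first call should capture everything up to the last ")" in the string, and the second only "m #(1)".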
If you want to test this regex in a PCRE-compliant regex tester, where a quantifier is lazy only when it carries its own ?, the pattern above should be written as
(\w+?)\s+?(#\s*?\(.*?\)\s*?)?(\w+?(?:\[\d+?\])??)\s*?\(
See the regex demo.
This regex works for you because \w+? at the start of the pattern behaves the same as \w+: it is followed by an obligatory \s, so it still has to consume the whole word. All the remaining lazy patterns work because of the obligatory patterns that follow them (the final \( is especially important here).
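For reference, here is a sketch of the question's proc with the lazy pattern dropped in, run against the sample input from the question. Returning a list instead of printing inside the loop is my own variation to make the result easy to check, and the name found is arbitrary. The # is written as \# on the assumption that -expanded stays in the regexp call.

```tcl
proc extract_verilog_instances {text} {
    # \w+? is the first quantified atom, so the whole pattern prefers the shortest match.
    # The hash stays escaped because -expanded treats a bare one as a comment start.
    set rexp {(\w+?)\s+(\#\s*\(.*\)\s*)?(\w+(?:\[\d+\])?)\s*\(}
    set found {}
    foreach {m0 m1 m2 m3} [regexp -inline -all -indices -expanded $rexp $text] {
        # m1 and m3 are index pairs for the module and instance captures.
        set module   [string range $text [lindex $m1 0] [lindex $m1 1]]
        set instance [string range $text [lindex $m3 0] [lindex $m3 1]]
        lappend found "module:$module instance:$instance"
    }
    return $found
}

set vlog {
second_module #(2) inst2 (.in2(sig2), .out2(sig3));
third_module inst3 (.in3(sig3), .out3(sig4));
fourth_module #(.in4_clk_freq(50), .in4_rst_val(1'b0)) inst4 (.in4_clk(clk), .in4_rst(rst), .in4_in1(sig4), .in4_in2(sig5), .out4(sig6));
}

set found [extract_verilog_instances $vlog]
puts [join $found \n]
```

On this input it should report one module/instance pair per statement, three in total. Note that this is still a text-level heuristic: a parameter list whose last value is followed by something other than whitespace and the instance name would need a real parser.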
Upvotes: 2