CLEX420

Reputation: 31

Regexp solution for list

I can't figure out the following regexp. Script1:

set name_list {{Avg Speed} {Total Time(hr) }}
set lines {{Avg Speed|0.000} {Total Time(hr)|NA} }

set line_number 0
foreach line $lines {
    incr line_number

    foreach name $name_list {
        puts "flag1: $name"
        if { [regexp "^$name\\|(.+)$" $line matchVar value ] } {
            puts "flag4: $value"
            set rval($name) $value
            set rval($name,line) $line_number
        }
    }
}

The output is something like:
Output1:

    flag1: Avg Speed
    flag4: 0.000
    flag1: Total Time(hr) 
    flag1: Avg Speed
    flag1: Total Time(hr)

If I change the first two lines like this:
Script2 (modified):

set name_list {{Avg Speed} {Total Time}}
set lines {{Avg Speed|0.000} {Total Time|NA} }

I get this:
Output2 (actual output, and the kind of output I want):

   flag1: Avg Speed
   flag4: 0.000
   flag1: Total Time
   flag1: Avg Speed
   flag1: Total Time
   flag4: NA

So how should I modify my regexp so that the name containing "(hr)" also gets a flag4 line in Output1 of the first script?

regexp "^$name\\|(.+)$" $line matchVar value       

What exactly is the purpose of matchVar? Here the match yields "0.000" and "NA", but how?

Upvotes: 0

Views: 66

Answers (2)

Donal Fellows

Reputation: 137567

The purpose of matchVar is to capture the entire substring that is matched by the RE; when you anchor the RE at both ends, it is the same as the input string (if the RE matches at all) and can probably be ignored. It's much more relevant when your RE is unanchored or only anchored on one side, when the overall matched (sub)string provides you with useful information.
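To make the difference between matchVar and the submatch variable concrete, here is a minimal sketch with an unanchored RE. The input line is made up for illustration and does not come from the question's data.

```tcl
# Hypothetical input line, chosen only for illustration
set line "id=7; Avg Speed|0.000; note"

# Unanchored RE: matchVar receives the whole matched substring,
# value receives only the text captured by the parentheses.
if {[regexp {Avg Speed\|([0-9.]+)} $line matchVar value]} {
    puts "matchVar: $matchVar"
    puts "value:    $value"
}
```

Here matchVar holds "Avg Speed|0.000" while value holds only "0.000". With both anchors in place, as in the question, matchVar would simply be the whole line.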

Your real problem though is that the string you are injecting into the RE to search for has itself got RE metacharacters in it; the (hr) is being treated as a sub-RE that happens to match hr and presents it as one of the submatched portions that regexp can report on. To fix this, we need to add backslashes to the RE metacharacters before substitution. We can do that with regsub -all as the RE sub-language guarantees that all metacharacters are not alphanumeric “word” characters, and can be quoted safely by putting a backslash in front of them.

# ...
foreach name $name_list {
    puts "flag1: $name"
    # Replace all non-word characters with their backslashed form
    regsub -all {\W} $name {\\&} REname
    if { [regexp "^$REname\\|(.+)$" $line matchVar value ] } {
        # ...
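
For reference, the fragment above can be filled out into a complete script along the following lines. This is a sketch, not a definitive version. It assumes the names are listed without the trailing space found in the question's `{Total Time(hr) }`, because an escaped trailing space would still have to be present in the line for the RE to match.

```tcl
# Assumption: names carry no trailing space (unlike the question's list)
set name_list {{Avg Speed} {Total Time(hr)}}
set lines {{Avg Speed|0.000} {Total Time(hr)|NA}}

set line_number 0
foreach line $lines {
    incr line_number
    foreach name $name_list {
        # Backslash-quote every non-word character so "(hr)" is literal
        regsub -all {\W} $name {\\&} REname
        if {[regexp "^$REname\\|(.+)$" $line matchVar value]} {
            puts "flag4: $value"
            set rval($name) $value
            set rval($name,line) $line_number
        }
    }
}
```

With these inputs the script should print a flag4 line for both names (0.000 and NA), since the parentheses in "Total Time(hr)" are now matched literally rather than treated as a capturing group.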

Upvotes: 1

Peter Lewerin

Reputation: 13252

Here's your problem: you're using a regular expression where a simple string comparison is appropriate:

set name_list {{Avg Speed} {Total Time(hr)}}      ;# <- note that I've removed the trailing space
set lines {{Avg Speed|0.000} {Total Time(hr)|NA}}

set line_number 0
foreach line $lines {
    incr line_number

    foreach name $name_list {
        puts "flag1: $name"
        lassign [split $line |] _name value
        if {$_name eq $name} {
            puts "flag4: $value"
            set rval($name) $value
            set rval($name,line) $line_number
        }
    }
}

Documentation: eq (operator), foreach, if, incr, lassign, puts, set, split

Upvotes: 1
