noobi

Reputation: 97

Extract the first line matching a specific pattern using regexp

I have a string

set text {show log

===============================================================================
Event Log 
===============================================================================
Description : Default System Log
Log contents  [size=500   next event=7  (not wrapped)]

6 2020/05/22 12:36:05.81 UTC CRITICAL: IOM #2001 Base IOM
"IOM:1>some text here routes "

5 2020/05/22 12:36:05.52 UTC CRITICAL: IOM #2001 Base IOM
"IOM:2>some other text routes "

4 2020/05/22 12:36:05.10 UTC MINOR: abc #2001 some text here also 222 def "

3 2020/05/22 12:36:05.09 UTC WARNING: abc #2011 some text here 111 ghj"

1 2020/05/22 12:35:47.60 UTC INDETERMINATE: ghe #2010 a,b, c="7" "
}

I want to extract the first line that starts with "IOM:" using regexp in Tcl, i.e.:

IOM:1>some text here routes 

But my implementation doesn't work. Can someone help?

regexp -nocase -lineanchor -- {^\s*(IOM:)\s*\s*(.*?)routes$} $line match tag value

Upvotes: 1

Views: 244

Answers (2)

glenn jackman

Reputation: 246764

In addition to @Wiktor's great answer, you might want to iterate over the matches:

set re {^\s*"(IOM):(.*)routes.*$}

foreach {match tag value} [regexp -all -inline -nocase -line -- $re $text] {
    puts [list $tag $value]
}

This prints:

IOM {1>some text here }
IOM {2>some other text }

I see that you have a non-greedy part in your regex. The Tcl regex engine is a bit weird compared to other languages: the first quantifier in the regex sets the greediness for the whole regex.

set re {^\s*(IOM:)\s*\s*(.*?)routes$}   ; # whole regex is greedy
set re {^\s*?(IOM:)\s*\s*(.*?)routes$}  ; # whole regex is non-greedy
# .........^^
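
A small sketch of that greediness rule, using a made-up string (aXbXb) rather than the log text. The variable names are just for illustration:

```tcl
set s {aXbXb}

# First quantifier is greedy, so the match is as long as possible.
set greedy [regexp -inline {a.*b} $s]

# First quantifier is non-greedy, so the match is as short as possible.
set lazy [regexp -inline {a.*?b} $s]

# The trailing .* looks greedy, but the leading .*? sets the preference,
# so the overall match is still the shortest one.
set mixed [regexp -inline {a.*?b.*} $s]

puts $greedy   ;# aXbXb
puts $lazy     ;# aXb
puts $mixed    ;# aXb
```

The third pattern is the case that tends to surprise people coming from other regex flavours, where the final .* would consume the rest of the string.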

Upvotes: 0

Wiktor Stribiżew

Reputation: 626738

You may use either of the following:

regexp -nocase -- {(?n)^"IOM:.*} $text match
regexp -nocase -line -- {^"IOM:.*} $text match

See the Tcl demo

Details

  • (?n) - newline-sensitive mode ON (same as the -line option), so that . cannot match line breaks. From the Tcl regex docs: "If newline-sensitive matching is specified, . and bracket expressions using ^ will never match the newline character (so that matches will never cross newlines unless the RE explicitly arranges it) and ^ and $ will match the empty string after and before a newline respectively, in addition to matching at beginning and end of string respectively."

  • ^ - start of a line

  • "IOM: - the literal string "IOM: (including the opening double quote)
  • .* - the rest of the line to its end.
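
If you also want to drop the surrounding double quotes, one possible variation (a sketch, not the only way) is to add a capture group. The text below is an abbreviated copy of the log from the question, and the variable name first is just for illustration:

```tcl
# Abbreviated copy of the log text from the question.
set text {6 2020/05/22 12:36:05.81 UTC CRITICAL: IOM #2001 Base IOM
"IOM:1>some text here routes "

5 2020/05/22 12:36:05.52 UTC CRITICAL: IOM #2001 Base IOM
"IOM:2>some other text routes "
}

# Without -all, regexp stops at the first match.
# The capture group keeps everything between the double quotes;
# with -line, [^"] cannot match a newline, so it stays on one line.
if {[regexp -nocase -line -- {^"(IOM:[^"]*)"} $text match first]} {
    puts [list $first]   ;# {IOM:1>some text here routes }
}
```

Note that the captured value keeps the trailing space that sits before the closing quote in the log; wrap it in string trim if that is not wanted.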

Upvotes: 3
