Reputation: 1
I am new to Tcl. I have been asked to extract the start date from a file, but my attempt produces no output. Please help.
This is the line in my file from which I want to extract the start date:
Running final_step.step_done at: Wed Oct 11 02:04:03 MYT 2017
My code:
proc extract_data {} {
## To extract startdate
set file [open files/stages.files]
while {[gets $file line] >= 0} {
if {[regexp {^Running (\S+\s)at: (\S+.*)$} $line match Stage StartDate]} {
if {[regexp "[$CURRENT_STAGE]\.step_done" $Stage]} {
#set stage $Stage
set end_date $StartDate
set print_end_date [regsub -all " " $StartDate "_"]
#echo "2) $stage - $end_date"
} elseif {[regexp "^[$CURRENT_STAGE] " $Stage]} {
#set stage $Stage
set start_date $StartDate
set print_start_date [regsub -all " " $StartDate "_"]
#echo "1) $stage - $start_date"
}
}
}
Is there something wrong with my regexp?
Upvotes: 0
Views: 4153
Reputation: 13282
It seems to me you should be able to get a lot done with code like this:
while {[gets $file line] >= 0} {
if {[string match Running* $line]} {
set Stage [lindex [split $line] 1]
# The time itself contains colons, so rejoin everything after the first ":"
set StartDate [string trim [join [lrange [split $line :] 1 end] :]]
if {[string match *.step_done $Stage]} {
set end_date $StartDate
set print_end_date [string map {" " _} $StartDate]
} else {
set start_date $StartDate
set print_start_date [string map {" " _} $StartDate]
}
}
}
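That is, for each line that starts with Running, set Stage to the second word of the line and StartDate to the date at the end of the line. If $Stage ends with .step_done, set end_date to $StartDate and print_end_date to the same string with all blanks replaced with underscores. Otherwise, set start_date and print_start_date in the same way.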
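A minimal, self-contained sketch of the loop above, run against an in-memory list instead of a file so it can be pasted into tclsh. The first sample line and its timestamp are made up for illustration (only the .step_done line comes from the question), and the date is taken as everything after "at:" because the time itself contains colons:

```tcl
# Hypothetical sample data: the first entry is an assumed "start" line,
# the second is the line quoted in the question.
set lines {
    {Running final_step at: Wed Oct 11 01:58:40 MYT 2017}
    {Running final_step.step_done at: Wed Oct 11 02:04:03 MYT 2017}
    {Some unrelated line}
}

foreach line $lines {
    # Skip anything that does not start with "Running ".
    if {![string match "Running *" $line]} continue

    # The second whitespace-separated word is the stage name.
    set Stage [lindex [split $line] 1]

    # Everything after the first "at:" is the date.
    set pos [string first "at:" $line]
    if {$pos < 0} continue
    set StartDate [string trim [string range $line [expr {$pos + 3}] end]]

    if {[string match *.step_done $Stage]} {
        set end_date $StartDate
        set print_end_date [string map {" " _} $StartDate]
    } else {
        set start_date $StartDate
        set print_start_date [string map {" " _} $StartDate]
    }
}

puts "start: $start_date ($print_start_date)"
puts "end:   $end_date ($print_end_date)"
```

To read from the real file, replace the foreach with the while/gets loop and remember to close the channel afterwards.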
Documentation: >= (operator), gets, if, lindex, set, split, string, while
Upvotes: 0
Reputation: 137787
The main RE looks fine: ^Running (\S+\s)at: (\S+.*)$ does indeed match the line that you're talking about. But these RE matches look suspicious:
regexp "[$CURRENT_STAGE]\.step_done" $Stage
regexp "^[$CURRENT_STAGE] " $Stage
In particular, you've got a command substitution in there with the name of the command coming from a variable. That's… valid in some circumstances, but quite an advanced technique; are you sure that's what you want? Also, the CURRENT_STAGE variable appears to be undeclared. I'd expect one of these approaches to be more likely to work:
Here, we're using the qualified version of the variable name. Note that the variable had better contain a valid regular expression fragment, and we need to double up the backslash (because we're in a double-quoted context and not a braced context; one backslash is for the basic Tcl language, and the other is for the RE engine).
regexp "$::CURRENT_STAGE\\.step_done" $Stage
regexp "^$::CURRENT_STAGE " $Stage
Here, we're calling a command to get the actual stage. The command had better return a valid RE fragment, and as before, we're doubling up the backslash.
regexp "[CURRENT_STAGE]\\.step_done" $Stage
regexp "^[CURRENT_STAGE] " $Stage
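As a rough illustration of the first approach, here is a small sketch that can be run in tclsh. It assumes CURRENT_STAGE is a global holding final_step (the question never shows where it is set), and the proc names classify and broken are invented for the example. The last part shows how the original form fails inside a proc when the variable is not visible there:

```tcl
# Assumed value, taken from the sample line in the question.
set CURRENT_STAGE final_step

# Hypothetical helper: classify a captured Stage string.
# Note that (\S+\s) in the main RE leaves a trailing space on Stage.
proc classify {Stage} {
    # In double quotes, "\\." reaches the RE engine as "\." (a literal dot).
    if {[regexp "$::CURRENT_STAGE\\.step_done" $Stage]} {
        return end
    } elseif {[regexp "^$::CURRENT_STAGE " $Stage]} {
        return start
    }
    return other
}

puts [classify "final_step.step_done "]
puts [classify "final_step "]
puts [classify "other_step "]

# The form from the question: CURRENT_STAGE is not a local variable here,
# so reading it raises an error before regexp is even called.
proc broken {Stage} {
    regexp "[$CURRENT_STAGE]\.step_done" $Stage
}
set rc [catch {broken "final_step.step_done "} msg]
puts "catch returned $rc: $msg"
```

Even with the variable in scope, the bracketed form would try to run a command named final_step, and the single backslash would be consumed by Tcl so the dot would match any character.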
In general, in both cases you might consider wrapping the part of the RE that represents the current stage in (?:…), as that doesn't really change the semantics much, but does mean that the RE fragment can use features like alternation safely. Not that it matters when the RE fragment is a simple thing like final_step.
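To see why the non-capturing group matters, here is a small sketch with a hypothetical stage fragment that uses alternation (the name init_step is invented for the example). Without the group, the alternation splits the whole RE in two; with it, the alternation is confined to the stage name:

```tcl
# Hypothetical RE fragment containing an alternation.
set CURRENT_STAGE {init_step|final_step}

# Ungrouped: reads as  ^init_step  OR  final_step\.step_done
set loose   "^$CURRENT_STAGE\\.step_done"
# Grouped: (init_step OR final_step) followed by \.step_done
set grouped "^(?:$CURRENT_STAGE)\\.step_done"

puts [regexp $loose   "init_step at"]          ;# 1, matches the bare ^init_step branch
puts [regexp $grouped "init_step at"]          ;# 0
puts [regexp $grouped "init_step.step_done"]   ;# 1
```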
Upvotes: 1