Reputation: 125
I have a list variable containing some values:
lappend list {query1} {query2} {query3}
And some data in file1, where the start of each line matches one of the values above:
query1 first data
query1 different data
query1 different data
query2 another data
query2 random data
query3 data something
query3 last data
How do I create a regexp loop that catches only the first instance found of each query and prints them out? In this case the output would be:
query1 first data
query2 another data
query3 data something
My attempted code to produce the output:
set readFile1 [open file1.txt r]
while { [gets $readFile1 data] > -1 } {
    for { set n 0 } { $n < [llength $list] } { incr n } {
        if { [regexp "[lindex $list $n]" $data] } {
            puts $data
        }
    }
}
close $readFile1
I tried using a for loop while reading the data from the file, but it prints every matching line, even though the -all option is not used.
Upvotes: 2
Views: 851
Reputation: 13252
package require fileutil
set queries {query1 query2 query3}
set result {}
::fileutil::foreachLine line file1.txt {
    foreach query $queries {
        if {![dict exists $result $query]} {
            if {[regexp $query $line]} {
                dict set result $query $line
                puts $line
            }
        }
    }
}
The trick here is to store the findings in a dictionary. If the dictionary already holds a value for a query, we don’t search for that query again. This also has the advantage that the found lines remain available to the script after the search instead of just being printed. The regexp search looks for the query string anywhere in the line; if it should match only at the beginning of the line, use regexp ^$query $line instead.
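As a rough illustration of the anchored variant, here is a minimal sketch. The proc name firstMatches and the sample lines are made up for this example, and the \M (end of word) constraint is an extra assumption on my part so that query1 does not also match a line starting with query10. Each query is still treated as a regular expression, so queries containing regexp metacharacters would need escaping first.

```tcl
# Hypothetical helper: return a dict mapping each query to the first line
# that starts with it as a whole word.
proc firstMatches {lines queries} {
    set result [dict create]
    foreach line $lines {
        foreach query $queries {
            # Skip queries that already have a stored line.
            if {[dict exists $result $query]} {
                continue
            }
            # ^ anchors at the start of the line; \M requires the end of a
            # word, so "query1" does not match a line starting with "query10".
            if {[regexp -- "^${query}\\M" $line]} {
                dict set result $query $line
            }
        }
    }
    return $result
}

# Made-up sample data with two lines that an unanchored search would catch.
set lines {
    "query10 should be skipped"
    "other query1 in the middle"
    "query1 first data"
    "query1 different data"
    "query2 another data"
}
set found [firstMatches $lines {query1 query2 query3}]
dict for {query line} $found {
    puts $line
}
```

Queries that never match (query3 in this sample) are simply absent from the returned dictionary, which the caller can check with dict exists.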
Documentation: dict, fileutil package, foreach, if, package, puts, regexp, set
Upvotes: 2
Reputation: 246807
An approach that does not use regexp at all. I assume your queries do not contain whitespace.
set list [list query1 query2 query3]
array set seen {}
set fh [open file1]
while {[gets $fh line] != -1} {
    # The first whitespace-separated word of the line is the candidate query.
    set query [lindex [split $line] 0]
    if {$query in $list && ![info exists seen($query)]} {
        set seen($query) 1
        puts $line
    }
}
close $fh
Output:
query1 first data
query2 another data
query3 data something
Upvotes: 1
Reputation: 16428
If the text file is small, you can read it as a whole into a variable with the read command, then apply regexp to the content to extract the required data.
set list {query1 query2 query3}
set fp [open file1.txt r]
set data [read $fp]
close $fp
foreach elem $list {
    # '-line' flag will enable the line sensitive matching
    if {[regexp -line "$elem.+" $data line]} {
        puts $line
    }
}
If the file is too large to hold in memory, or if run-time memory usage is a concern, read the content line by line instead. In that case you need to track what has already matched, for example with an array that records whether the first occurrence of each query has been seen.
set list {query1 query2 query3}
set fp [open file1.txt r]
array set first_occurence {}
while {[gets $fp line]!=-1} {
    foreach elem $list {
        if {[info exists first_occurence($elem)]} {
            continue
        }
        if {[regexp $elem $line]} {
            set first_occurence($elem) 1
            puts $line
        }
    }
}
close $fp
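For a large file, the line-by-line loop above can also stop reading as soon as every query has been matched. The following is only a sketch under a few assumptions: the proc name printFirstMatches is hypothetical, the queries in the list are assumed to be unique, and the demo uses file tempfile, which needs Tcl 8.6 or later.

```tcl
# Hypothetical helper: print the first line matching each query, reading
# line by line and stopping once every query has been matched.
proc printFirstMatches {path queries} {
    set fp [open $path r]
    array set seen {}
    set printed {}
    while {[gets $fp line] >= 0} {
        foreach elem $queries {
            if {[info exists seen($elem)]} {
                continue
            }
            if {[regexp -- $elem $line]} {
                set seen($elem) 1
                lappend printed $line
                puts $line
            }
        }
        # Every query has a match: no need to read the rest of the file.
        # (Assumes the queries are unique.)
        if {[array size seen] == [llength $queries]} {
            break
        }
    }
    close $fp
    return $printed
}

# Demo on a temporary file (file tempfile requires Tcl 8.6 or later).
set tmp [file tempfile path]
puts $tmp "query1 first data\nquery1 different data\nquery2 another data\nquery3 data something\nquery3 last data"
close $tmp
set printed [printFirstMatches $path {query1 query2 query3}]
file delete $path
```

The proc also returns the printed lines as a list, so the caller can reuse them after the search.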
Reference : regexp
Upvotes: 2
Reputation: 1482
Try this:
set fd [open "query_file.txt" r]
set data [read $fd]
close $fd
# Collect the first word of every line, then drop duplicates.
set uniq_list ""
foreach l [split $data "\n"] {
    lappend uniq_list [lindex $l 0]
}
set uniq_list [lsort -unique $uniq_list]
foreach l $uniq_list {
    if {[string equal $l ""]} {
        continue
    }
    # Print only the first line that matches this word.
    foreach line [split $data "\n"] {
        if {[regexp $l $line]} {
            puts "$line"
            break
        }
    }
}
References: file, list, regexp
Upvotes: 1