Tcl_newbie

Reputation: 135

Substring extraction in TCL

I'm trying to extract a sequence of characters from a string in TCL.
Say, I have "blahABC:blahDEF:yadamsg=abcd".
I want to extract the substring starting with "msg=" until I reach the end of the string.
Or rather I am interested in extracting "abcd" from the above example string.
Any help is greatly appreciated.
Thanks.

Upvotes: 1

Views: 14958

Answers (4)

Gabriele Serra

Reputation: 531

The simplest solution is to use string first. Using that, you can retrieve the index of the first occurrence of the searched string.

% set s blahABC:blahDEF:yadamsg=abcd
% string first "msg=" $s
20

Then you can extract the rest of the string with string range, which returns the portion of the string between a start and an end index (use the keyword end as the second index to extract from the given index through the last character).

% string range $s [string first "msg=" $s] end
msg=abcd

If you want only 'abcd', add 4 to the start index to skip the four characters of 'msg='.

% string range $s [string first "msg=" $s]+4 end
abcd
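
One caveat with the commands above: string first returns -1 when the searched string is absent, and the +4 arithmetic would then produce a meaningless start index. A minimal sketch of a guarded version follows; the proc name extract_after is made up for illustration, and it assumes an empty string is an acceptable "not found" result.

```tcl
# Hypothetical helper: return the text following $marker in $s,
# or an empty string when $marker does not occur in $s.
proc extract_after {s marker} {
    set idx [string first $marker $s]
    if {$idx < 0} {
        return ""
    }
    # Skip past the marker itself, whatever its length.
    return [string range $s [expr {$idx + [string length $marker]}] end]
}

puts [extract_after "blahABC:blahDEF:yadamsg=abcd" "msg="]
```

Using string length instead of a hard-coded 4 keeps the helper correct if the marker changes.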

Upvotes: 0

wolfhammer

Reputation: 2661

Code

proc value_of {key matches} {

        # Look up the requested key; the captured value is the next list element.
        set index [lsearch -exact $matches $key]

        if {$index != -1} {
                return [lindex $matches $index+1]
        }
        return ""
}

set x "blahABC:blahDEF:yadamsg=abcd:blahGHI"
set matches [regexp -all -inline {([a-zA-Z]+)=([^:]*)} $x]
puts [value_of "yadamsg" $matches]

Output:

abcd

Update: upvar is not needed; see the comments.

Upvotes: 1

glenn jackman

Reputation: 246744

Another approach: split the query parameter using & as the separator, find the element starting with "msg=" and then get the text after the =

% set string blahblah&msg=abcd&yada
blahblah&msg=abcd&yada
% lsearch -inline [split $string &] {msg=*}
msg=abcd
% string range [lsearch -inline [split $string &] {msg=*}] 4 end
abcd
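
The same split-based idea can be wrapped in a proc that compares the key exactly instead of using a glob pattern, and that reports a missing key. This is only a sketch: the name param_value is hypothetical, and returning an empty string for a missing key is an assumption.

```tcl
# Hypothetical helper: look up $key in an &-separated "key=value" string.
proc param_value {query key} {
    foreach field [split $query &] {
        # Split only at the first "=", so values may contain "=" themselves.
        set eq [string first = $field]
        if {$eq >= 0 && [string range $field 0 [expr {$eq - 1}]] eq $key} {
            return [string range $field [expr {$eq + 1}] end]
        }
    }
    # Key not present in any field.
    return ""
}

puts [param_value "blahblah&msg=abcd&yada" msg]
```

Fields without an = (such as blahblah and yada here) are skipped rather than treated as keys.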

Upvotes: 1

joheid

Reputation: 418

Regular expressions are the tool for this kind of task. The general syntax in Tcl is:

regexp ?switches? exp string ?matchVar? ?subMatchVar subMatchVar ...?

A simple solution for your task would be:

 set string blahblah&msg=abcd&yada

 # Match an =, then zero or more characters that are not an &, then one &. The braces {} around the pattern are necessary because Tcl and re_syntax share special characters.

 set exp {=([^&]*)&}

 # -> is an idiom. It is simply the name of the variable that receives the whole match, which is thrown away; only the submatch is used.

 regexp $exp $string -> subMatch

 # Read back the captured value (the variable name is subMatch, without the $).
 set subMatch
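
The pattern above takes the value after the first = in the string and requires a trailing &, so it would miss a msg= field at the very end. A possible variant that names the key and accepts either & or the end of the string is sketched below; the proc name regexp_value is hypothetical, and the sketch assumes the key contains no regexp metacharacters.

```tcl
# Hypothetical helper: capture the value that follows "<key>=" up to the
# next "&" or the end of the string.
# Assumption: $key contains no regexp metacharacters.
proc regexp_value {query key} {
    # (?:^|&) anchors the key at the start of the string or after an "&".
    set exp "(?:^|&)${key}=(\[^&\]*)"
    if {[regexp -- $exp $query -> value]} {
        return $value
    }
    return ""
}

puts [regexp_value "blahblah&msg=abcd&yada" msg]
puts [regexp_value "blahblah&msg=abcd" msg]
```

Checking the return value of regexp avoids reading a capture variable that was never set when nothing matches.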

A nice tool for experimenting with regexps is Visual Regexp (http://laurent.riesterer.free.fr/regexp/). I'd recommend downloading it and playing with it.

The relevant man pages are re_syntax, regexp and regsub

Joachim

Upvotes: 2
