Reputation: 981
I need to convert a string that contains a list of several elements, each like (<>,abcd1,1,1), as shown below.
From:
test={abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
To:
abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
I tried to extract the list inside abc([]) using the regsub below. The string always has "abc([" at the beginning and "])" at the end.
regsub -all {(abc\(\[)([a-z0-9\<\>\(\),]+)(\)\])} $test {\2} test2
Then, from test2, I use a for loop to extract the second, third and fourth items from each element such as (<>,abcd1,1,1).
Is there a simple way to do this extraction with regsub/regexp instead of a for loop?
The regex should extract the second, third and fourth items and ignore the first, as well as the fifth and sixth if they are present.
Upvotes: 0
Views: 1615
Reputation: 246799
regsub -all -expanded {
    \(                      # a literal opening parenthesis
    [^(,]+ ,                # first field (dropped): 1 or more chars other than "(" or ",", then a comma
    ( [^,]+ , \d+ , \d+ )   # the 3 fields to keep, with their commas
    [^)]*                   # any remaining fields (dropped): 0 or more chars other than ")"
    \)                      # a literal closing parenthesis
} $test {(\1)}
returns
abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
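To store the result in a variable instead (as the question does with test2), a variable name can be appended as a last argument; regsub then returns the number of substitutions rather than the new string. A minimal sketch, assuming test holds the sample string from the question; the one-line pattern is meant to be the same regex as above with the whitespace and comments removed:

```tcl
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}

# With a trailing variable name, regsub writes the result into that
# variable and returns the number of replacements it made.
set count [regsub -all {\([^(,]+,([^,]+,\d+,\d+)[^)]*\)} $test {(\1)} test2]

puts $count   ;# 4
puts $test2   ;# abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
```

Checking the count against the expected number of elements is a cheap way to notice an element that did not fit the pattern.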
Upvotes: 1
Reputation: 71538
OK, based strictly on what you have in your question, and provided you are already sure the string begins with abc([ and ends with ]), you could first grab everything inside the innermost parentheses with a regex:
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^()]+\)} $test]
# (<>,yifow3,1,1) (abc,yifow3,2,2,20140920,20151021) (<>,yifow3,3,3,20140920,20151021) (<>,yifow3,4,4)
Then you can loop through each item (split on the commas, take the 2nd to 4th elements, join them back, and so on).
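That loop could look roughly like the following. It is only a sketch under the same assumption about the input shape; the variable names kept, fields and output are illustrative choices:

```tcl
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^()]+\)} $test]

set kept {}
foreach item $items {
    # Drop the surrounding parentheses, then split the fields on commas.
    set fields [split [string range $item 1 end-1] ,]
    # Keep the 2nd to 4th fields (indices 1 to 3) and rebuild the element.
    lappend kept "([join [lrange $fields 1 3] ,])"
}

# Put the abc([ ... ]) wrapper back around the rebuilt elements.
set output "abc(\[[join $kept ,]])"
puts $output
# abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
```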
I don't think you can avoid a loop if you want to keep it simple. You can skip a few steps, I guess, with a more elaborate (no longer simple!) regex:
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^,()]+((?:,[^,()]+){3})} $test]
set results [lmap {a b} $items {list [string trim $b ,]}]
# yifow3,1,1 yifow3,2,2 yifow3,3,3 yifow3,4,4
The regex here, \([^,()]+((?:,[^,()]+){3}), matches as follows:
\(                    # Literal opening paren
[^,()]+               # One or more characters other than ',', '(' and ')' (the first field)
(                     # Start of the capture group
  (?:,[^,()]+){3}     # A comma followed by one or more characters other than ',', '(' and ')',
                      # the whole thing 3 times
)                     # End of the capture group
I used lmap (Tcl 8.6) here, which is basically a kind of loop. You can change it a bit to get the string you are looking for:
set results [lmap {a b} $items {list "([string trim $b ,])"}]
set output "abc(\[[join $results ,]])"
# abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
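On a Tcl older than 8.6, where lmap is not available, the same thing can be written with foreach and lappend. A sketch, assuming the same test string and regex as above:

```tcl
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^,()]+((?:,[^,()]+){3})} $test]

set results {}
# items alternates between the full match (a) and the captured group (b);
# only the captured group is needed here.
foreach {a b} $items {
    lappend results "([string trim $b ,])"
}

set output "abc(\[[join $results ,]])"
puts $output
# abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])
```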
Upvotes: 1