Reputation: 28622
I'm using Ruby's regular expressions to process text such as:
${1:aaa|bbbb}
${233:aaa | bbbb | ccc ccccc }
${34: aaa | bbbb | cccccccc |d}
${343: aaa | bbbb | cccccccc |dddddd ddddddddd}
${3443:a aa|bbbb|cccccccc|d}
${353:aa a| b b b b | c c c c c c c c | dddddd}
I want to get the trimmed text between the pipe characters. For example, for the first line above I want aaa and bbbb; for the second line I want aaa, bbbb and ccc ccccc. I have written a regular expression and a piece of Ruby code to test it:
array = "${33:aaa|bbbb|cccccccc}".scan(/\$\{\s*(\d+)\s*:(\s*[^\|]+\s*)(?:\|(\s*[^\|]+\s*))+\}/)
puts array
My problem is that the (?:\|(\s*[^\|]+\s*))+ part doesn't create multiple groups. I don't know how to solve this, because the number of items I need from each line is variable. Can anyone help?
Upvotes: 1
Views: 771
Reputation: 22009
When you repeat a capturing group in a regular expression, the capturing group only stores the text matched by its last iteration. If you need to capture multiple iterations, you'll need to use more than one regex. (.NET is a notable exception to this. Its CaptureCollection provides the matches of all iterations of a capturing group.)
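To make this concrete, here is a small sketch that runs the pattern from the question against its own sample string. Only the variable names are introduced here for illustration.

```ruby
# Pattern from the question: the last group sits inside a repeated (?:...)+.
pattern = /\$\{\s*(\d+)\s*:(\s*[^\|]+\s*)(?:\|(\s*[^\|]+\s*))+\}/

match = pattern.match('${33:aaa|bbbb|cccccccc}')
captures = match.captures

# The third group matched twice ("bbbb", then "cccccccc"),
# but only the final iteration is kept.
p captures # => ["33", "aaa", "cccccccc"]
```

The middle item is lost, which is why a second step (or a second regex) is needed.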
In your case, you could do a search-and-replace to replace ^\$\{\s*\d+\s*:\s* and the trailing \s*\}$ with nothing. That strips off the ${, the number, and the colon at the start of your string, as well as the closing brace. Then call split() using the regex \s*\|\s* to split the remaining string into the elements delimited by vertical bars.
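A minimal sketch of this strip-then-split approach in Ruby, assuming every input has the shape ${number:item|item|...}. The method name pipe_items is hypothetical, and the prefix pattern used here also consumes the leading ${ and the trailing } so that they do not end up in the first or last item.

```ruby
# Hypothetical helper: returns the trimmed items between the pipes.
def pipe_items(text)
  body = text.strip
             .sub(/\A\$\{\s*\d+\s*:\s*/, '') # drop "${", the number and the colon
             .sub(/\s*\}\z/, '')             # drop the closing brace
  # Split on pipes, swallowing any whitespace around them.
  body.split(/\s*\|\s*/)
end

p pipe_items('${233:aaa | bbbb | ccc ccccc }') # => ["aaa", "bbbb", "ccc ccccc"]
```

Because the whitespace is part of the split pattern, no separate trimming pass is needed.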
Upvotes: 1
Reputation: 42421
Instead of trying to do everything at once, divide and conquer:
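DATA.each do |line|
  # Grab everything between the colon and the closing brace.
  line =~ /:(.+)\}/
  # Split on pipes, ignoring whitespace around them.
  items = $1.strip.split( /\s* \| \s*/x )
  p items
end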
__END__
${1:aaa|bbbb}
${233:aaa | bbbb | ccc ccccc }
${34: aaa | bbbb | cccccccc |d}
${343: aaa | bbbb | cccccccc |dddddd ddddddddd}
${3443:a aa|bbbb|cccccccc|d}
${353:aa a| b b b b | c c c c c c c c | dddddd}
If you want to do it with a single regex, you can use scan, but this seems more difficult to grok:
DATA.each do |line|
  # Each item starts after a colon or pipe and runs up to the next pipe or brace.
  items = line.scan( /[:|] ([^|}]+) /x ).flatten.map(&:strip)
  p items
end
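The divide-and-conquer idea also extends to the leading number, which the question's original pattern tried to capture. The following is only a sketch: the method name parse_placeholder and its [number, items] return shape are hypothetical choices, not part of the code above.

```ruby
# Hypothetical helper: returns [number, items], or nil if the line does not match.
def parse_placeholder(line)
  # Step 1: one simple regex isolates the number and the raw item list.
  match = line.match(/\$\{\s*(\d+)\s*:(.*)\}/)
  return nil unless match

  # Step 2: split the item list on pipes, trimming surrounding whitespace.
  items = match[2].strip.split(/\s*\|\s*/)
  [match[1].to_i, items]
end

p parse_placeholder('${353:aa a| b b b b | c c c c c c c c | dddddd}')
# => [353, ["aa a", "b b b b", "c c c c c c c c", "dddddd"]]
```

Returning nil for non-matching lines lets the caller decide whether to skip or report them.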
Upvotes: 1
Reputation: 77796
This might help you:
a = [
'${1:aaa|bbbb}',
'${233:aaa | bbbb | ccc ccccc }',
'${34: aaa | bbbb | cccccccc |d}',
'${343: aaa | bbbb | cccccccc |dddddd ddddddddd}',
'${3443:a aa|bbbb|cccccccc|d}',
'${353:aa a| b b b b | c c c c c c c c | dddddd}'
]
a.each do |input|
  puts input
  # Each item follows a colon or pipe and runs up to the next pipe or brace.
  input.scan(/[:|]([^|}]+)/).flatten.each do |s|
    puts s.strip # trim leading and trailing whitespace
  end
end
Output:
${1:aaa|bbbb}
aaa
bbbb
${233:aaa | bbbb | ccc ccccc }
aaa
bbbb
ccc ccccc
${34: aaa | bbbb | cccccccc |d}
aaa
bbbb
cccccccc
d
${343: aaa | bbbb | cccccccc |dddddd ddddddddd}
aaa
bbbb
cccccccc
dddddd ddddddddd
${3443:a aa|bbbb|cccccccc|d}
a aa
bbbb
cccccccc
d
${353:aa a| b b b b | c c c c c c c c | dddddd}
aa a
b b b b
c c c c c c c c
dddddd
Upvotes: 1
Reputation: 83680
Why don't you split your string?
str = "${233:aaa | bbbb | ccc ccccc }"
# Note: splitting on \d+ also breaks up items that themselves contain digits.
str.split(/\d+|\$|\{|\}|:|\|/).reject(&:empty?).map(&:strip).join(', ')
#=> "aaa, bbbb, ccc ccccc"
Upvotes: 1