Reputation: 3123
I have a string such as "++++001------zx.......?????????xxxxxxx". I would like to extract every run of two or more identical consecutive characters into a flat array using a Ruby regex:
["++++",
"00",
"------",
".......",
"?????????",
"xxxxxxx"]
I can achieve this with a nested loop:
s = "++++001------zx.......?????????xxxxxxx"
t = s.split(//)
i = 0
f = []
while i <= t.length - 1 do
  j = i
  part = ""
  while t[i] == t[j] do
    part = part + t[j]
    j = j + 1
  end
  i = j
  if part.length >= 2 then f.push(part) end
end
But I am unable to find an appropriate regex to pass to the scan method. I tried this: s.scan(/(.)\1++/x)
but it returns only the first character of each repeated sequence.
Is it possible at all?
Upvotes: 1
Views: 2258
Reputation: 626748
If you only need the overall match values and want to ignore all capturing group values, similar to how String#match works in JavaScript, you can call String#gsub with a single regex argument (no replacement argument). This returns an Enumerator, and .to_a converts it to an array of matches:
text = "++++001------zx.......?????????xxxxxxx"
p text.gsub(/(.)\1+/m).to_a
# => ["++++", "00", "------", ".......", "?????????", "xxxxxxx"]
See the Ruby demo online and the Rubular demo (note how the matches are highlighted in the Match result field).
I added the m modifier just for completeness, so that . also matches line break characters, which it does not match by default.
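To illustrate what the m modifier changes, here is a small sketch. The sample string "aa\n\nbb" and the variable names are my own, chosen only for the demonstration; they are not from the question.

```ruby
# Hypothetical sample input containing a run of two newlines.
text = "aa\n\nbb"

# Without /m, "." does not match "\n", so the newline run is skipped.
without_m = text.gsub(/(.)\1+/).to_a

# With /m, "." also matches "\n", so the run of two newlines is reported.
with_m = text.gsub(/(.)\1+/m).to_a

p without_m  # => ["aa", "bb"]
p with_m     # => ["aa", "\n\n", "bb"]
```

If the input never contains line breaks, both forms should give the same result.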
Also, see a related Capturing groups don't work as expected with Ruby scan method thread.
Upvotes: 0
Reputation: 4421
This is a bit tricky.
You want to capture any run of two or more of the same character. A good way to do this is with backreferences. Your solution is close to being correct.
/((.)\2+)/
should do the trick.
Note that if you use scan, it returns a two-element array for each match because the pattern has two capturing groups: the first element is the whole sequence, and the second is the repeated character.
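As a rough sketch of how to get the flat array from that result, you can keep only the first element of each pair. The variable names pairs and runs are just for illustration.

```ruby
str = "++++001------zx.......?????????xxxxxxx"

# scan yields one [whole_run, single_char] pair per match,
# since the pattern contains two capturing groups.
pairs = str.scan(/((.)\2+)/)
p pairs.first  # => ["++++", "+"]

# Keep only the first capture (the whole run) to get a flat array.
runs = pairs.map(&:first)
p runs
# => ["++++", "00", "------", ".......", "?????????", "xxxxxxx"]
```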
Upvotes: 3
Reputation: 118261
str = "++++001------zx.......?????????xxxxxxx"
str.chars.chunk { |e| e }.map { |e| e[1].join if e[1].size > 1 }.compact
# => ["++++", "00", "------", ".......", "?????????", "xxxxxxx"]
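A possible variation on the same non-regex idea, assuming Ruby 2.3 or later where Enumerable#chunk_while is available: group adjacent equal characters, then keep only the groups longer than one. This is a sketch of an alternative, not a replacement for the line above.

```ruby
str = "++++001------zx.......?????????xxxxxxx"

runs = str.each_char
          .chunk_while { |a, b| a == b }      # group adjacent equal characters
          .select { |group| group.size > 1 }  # drop single-character groups
          .map(&:join)                        # turn each group back into a string

p runs
# => ["++++", "00", "------", ".......", "?????????", "xxxxxxx"]
```

Using select before map avoids producing nil entries, so no compact call is needed.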
Upvotes: 1