misterwolf

Reputation: 424

Regular Expression group repeated letters

I am trying to group all the repeated letters in a string.

For example:

"aaaaaaabbbbbbbbc" => [['aaaaaaa'],['bbbbbbbb'],['c']]

The only way I could find to do this in Ruby was:

.scan(/(?:a+|A+)|(?:b+|B+)|(?:c+|C+)| ..... |(?:y+|Y+)|(?:z+|Z+)/)

where ... are the other alphabet letters.

Is there a way to DRY up that regex? I also tried a backreference (\1), but it doesn't match single letters and it doesn't return the exact run of letters: (\w+)\1 => [['aa'],['bb']]

Am I wrong to use regular expressions for this case? Should I use Ruby's iteration methods instead?

I would be glad to hear your opinion :) Thanks!

Upvotes: 5

Views: 1598

Answers (4)

Avinash Raj

Reputation: 174874

Just use another capturing group to catch the repeated characters.

s.scan(/((\w)\2*)/).map(&:first)
# => ["aaaaaaa", "bbbbbbbb", "c"]
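A minimal sketch of why the outer group is needed, assuming s holds the string from the question. When the pattern contains capture groups, String#scan returns the captures for each match rather than the match itself, so group 1 is wrapped around the whole run and group 2 captures the single letter that \2 refers back to. The "aaAAb" input is a made-up example to show the case handling.

```ruby
s = "aaaaaaabbbbbbbbc"

# With capture groups, scan returns [group 1, group 2] for every match:
# group 1 is the whole run, group 2 the single letter that starts it.
pairs = s.scan(/((\w)\2*)/)
# pairs => [["aaaaaaa", "a"], ["bbbbbbbb", "b"], ["c", "c"]]

# Keeping only group 1 gives the runs.
runs = pairs.map(&:first)
# runs => ["aaaaaaa", "bbbbbbbb", "c"]

# The backreference is case-sensitive, so "a" and "A" form separate runs,
# like the a+|A+ alternation in the question.
mixed = "aaAAb".scan(/((\w)\2*)/).map(&:first)
# mixed => ["aa", "AA", "b"]
```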

Upvotes: 7

Cary Swoveland

Reputation: 110755

Here are a few other ways to do that. All return ["aaaaaaa", "bbbbbbbb", "c"]. If [["aaaaaaa"], ["bbbbbbbb"], ["c"]] is truly wanted (I can't imagine why), that's a simple extra step using map.

s.each_char.chunk(&:itself).map { |_, a| a.join }

s.each_char.chunk_while { |a,b| b == a }.map(&:join)

s[1..-1].each_char.with_object([s[0]]) { |c, a| c == a.last[0] ? (a.last << c) : a << c }

s.gsub(/(.)\1*/).with_object([]) { |t,a| a << t }

In the last of these, String#gsub is called without a block or a replacement argument, so it returns an enumerator (and does not perform any character replacement). This form of gsub is useful in many situations.
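As a rough illustration of the enumerator form, assuming the same s as in the question: the enumerator yields each matched substring, so to_a is enough to collect the runs, and the nested shape from the question takes one more map.

```ruby
s = "aaaaaaabbbbbbbbc"

# Without a block or replacement argument, gsub returns an Enumerator
# over the matched substrings and leaves s untouched.
enum = s.gsub(/(.)\1*/)

runs = enum.to_a
# runs => ["aaaaaaa", "bbbbbbbb", "c"]

# The nested form from the question is one extra map away.
nested = runs.map { |run| [run] }
# nested => [["aaaaaaa"], ["bbbbbbbb"], ["c"]]
```

Note that . does not match a newline by default, so this sketch assumes single-line input.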

Upvotes: 1

Oleksandr Holubenko

Reputation: 4440

One more solution without regexp :)

"aaaaaaabbbbbbbbc".chars.group_by(&:itself).values.map { |e| [e.join] }
 #=> [["aaaaaaa"], ["bbbbbbbb"], ["c"]]
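One caveat, sketched with a made-up input: group_by collects every occurrence of a letter regardless of position, so it matches the question's output only when equal letters are already adjacent. If consecutive runs are what is wanted, chunk_while (Ruby 2.3+) keeps separated runs apart.

```ruby
s = "aabaa"

# group_by collects every occurrence of a letter, wherever it appears.
grouped = s.chars.group_by(&:itself).values.map { |e| [e.join] }
# grouped => [["aaaa"], ["b"]]

# chunk_while only merges neighbouring equal letters.
runs = s.each_char.chunk_while { |a, b| a == b }.map { |e| [e.join] }
# runs => [["aa"], ["b"], ["aa"]]
```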

Upvotes: 2

Sebastián Palma

Reputation: 33491

Without using a regex you could take a look at Enumerable#slice_when:

string = "aaaaaaabbbbbbbbc"
p string.chars.sort.slice_when { |a, b| a != b }.map { |element| element.join.split }
# [["aaaaaaa"], ["bbbbbbbb"], ["c"]]
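A possible variant of the snippet above, shown on a made-up input that is unsorted and contains spaces. Dropping sort keeps the runs in their original order, and wrapping with [run.join] instead of run.join.split avoids losing a run of whitespace, since split with no arguments discards it. This is a sketch under those assumptions, not necessarily what the answer intended.

```ruby
string = "bb  a"

# No sort, so runs keep their original order; [run.join] keeps the spaces.
runs = string.chars.slice_when { |a, b| a != b }.map { |run| [run.join] }
# runs => [["bb"], ["  "], ["a"]]

# For comparison, the sort/split form on the same input reorders the
# letters and turns the run of spaces into an empty array.
sorted = string.chars.sort.slice_when { |a, b| a != b }.map { |run| run.join.split }
# sorted => [[], ["a"], ["bb"]]
```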

Upvotes: 1
