arthurborges

Reputation: 15

How to split a string by grouping consecutive same chars

I have a string made up of runs of consecutive identical characters, like this: "aaabbc". I want to group those runs into an array: ["aaa", "bb", "c"].

I've already solved it using a Hash, and that worked, but now I wonder whether it is possible to solve it using split and a regex.

This is what I have tried, based on another answer from SO:

"aaabbc".split(/\\b([a-z])\\1+\\b/)

But it's giving me just the initial string in an array:

["aaabbc"] 

Instead of giving one element per group of consecutive identical characters:

["aaa", "bb", "c"]

Upvotes: 0

Views: 1260

Answers (3)

ugy

Reputation: 904

You can use String#scan too. It's a bit messier than the gsub method, as the gsub answer alludes to:

"aaabbbbc".scan(/(.)(\1*)/)
  #=> [["a", "aa"], ["b", "bbb"], ["c", ""]]

Because the pattern contains capture groups, scan returns an array of the captured groups for each match rather than the whole match, so the first group (.) is on its own each time. It's a simple matter to join the groups back up:

"aaabbbbc".scan(/(.)(\1*)/).map(&:join)
 #=> ["aaa", "bbbb", "c"]
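If you'd rather skip the join, one possible variation (a sketch, not part of the original answer) is to wrap the whole pattern in an outer capture group, so that the first element of each result is the complete run. The variable name `runs` is only illustrative, and the backreference becomes \2 because the single character is now the second group:

```ruby
# Outer group 1 captures the whole run; inner group 2 captures one character,
# and \2* matches any further repetitions of that same character.
runs = "aaabbbbc".scan(/((.)\2*)/).map(&:first)

p runs
```

Each element that scan returns here is a two-item array (the whole run, then the single character), so map(&:first) keeps only the run.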

Upvotes: 0

Cary Swoveland

Reputation: 110725

"aaabbc".gsub(/(.)\1*/).to_a
  #=> ["aaa", "bb", "c"] 

This uses the form of String#gsub when no block is given, in which case an enumerator is returned. In fact, this form of gsub has nothing to do with string replacements; the enumerator merely generates matches. It overcomes a limitation of String#scan when capture groups are present.

The regular expression reads, "match any character, saving it in capture group 1, then match zero or more characters equal to the contents of capture group 1".
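To make the difference from String#scan concrete, here is a small side-by-side sketch using the same pattern. The variable names are only illustrative:

```ruby
str = "aaabbc"
pattern = /(.)\1*/

# gsub without a block or replacement returns an Enumerator over whole matches.
enum = str.gsub(pattern)
whole_matches = enum.to_a

# scan with a capture group returns only the captured groups for each match.
captures_only = str.scan(pattern)

p enum.class      # Enumerator
p whole_matches   # ["aaa", "bb", "c"]
p captures_only   # [["a"], ["b"], ["c"]]
```

Note that . does not match a newline by default, so a string containing newlines would need the /m flag for those characters to be grouped as well.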

Upvotes: 7

3limin4t0r

Reputation: 21130

This answer does not use split, but offers another alternative. You could make use of Enumerable#chunk_while:

"aaabbc".each_char.chunk_while(&:==).map(&:join)
#=> ["aaa", "bb", "c"]

This first pulls apart the string into a sequence of characters, then compares each pair of consecutive elements using ==, keeping them in the same chunk while the comparison is true. This produces arrays of characters. Finally, each character array is converted back to a string using join.
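The same chain can be written out step by step to show the intermediate values. This is just an expanded sketch of the one-liner above, with an explicit block in place of &:== and illustrative variable names (chunk_while requires Ruby 2.3 or later):

```ruby
# Step 1: split the string into individual characters.
chars = "aaabbc".each_char.to_a

# Step 2: keep adjacent characters in the same chunk while they are equal.
chunks = chars.chunk_while { |left, right| left == right }.to_a

# Step 3: join each chunk of characters back into a string.
groups = chunks.map(&:join)

p chars   # ["a", "a", "a", "b", "b", "c"]
p chunks  # [["a", "a", "a"], ["b", "b"], ["c"]]
p groups  # ["aaa", "bb", "c"]
```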

Upvotes: 6
