Reputation: 15
I have a string made up of runs of consecutive identical characters, like this: "aaabbc", and I want to group those runs into an array: ["aaa", "bb", "c"].
I've already solved it using a Hash, and that worked, but now I wonder whether it is possible to solve it using split and a regex.
This is what I tried, based on another answer from SO:
"aaabbc".split(/\\b([a-z])\\1+\\b/)
But it's giving me just the initial string in an array:
["aaabbc"]
instead of an array with one element per group of identical consecutive characters:
["aaa", "bb", "c"]
Upvotes: 0
Views: 1260
Reputation: 904
You can use String#scan too. It's a bit messier than the gsub approach given in the other answer:
"aaabbbbc".scan(/(.)(\1*)/)
#=> [["a", "aa"], ["b", "bbb"], ["c", ""]]
Because the pattern contains capture groups, scan returns an array of the groups for each match instead of the whole match, so the first group (.) is on its own each time. It's a simple matter to join the groups back up:
"aaabbbbc".scan(/(.)(\1*)/).map(&:join)
#=> ["aaa", "bbbb", "c"]
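A variation on the same idea, offered as a sketch rather than as part of the original answer: wrapping the whole run in an outer capture group means scan already hands back the full run as the first element of each pair, so no join is needed. The method name runs_via_scan is hypothetical and only used for illustration.

```ruby
# Hypothetical helper name; wraps the scan-based approach.
def runs_via_scan(str)
  # Group 1 is the whole run, group 2 is the run's first character.
  # scan yields [whole_run, first_char] pairs, so keep only the first element.
  str.scan(/((.)\2*)/).map(&:first)
end

p runs_via_scan("aaabbbbc") #=> ["aaa", "bbbb", "c"]
```

Note that . does not match a newline by default, so a string containing newlines would need the /m flag.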
Upvotes: 0
Reputation: 110725
"aaabbc".gsub(/(.)\1*/).to_a
#=> ["aaa", "bb", "c"]
This uses the form of String#gsub that takes only a pattern, with no replacement argument and no block, in which case an enumerator is returned. In fact, this form of gsub has nothing to do with string replacements; the enumerator merely generates matches. It overcomes a limitation of String#scan when capture groups are present: scan then returns the captured groups rather than the full match.
The regular expression reads, "match any character, saving it in capture group 1, then match zero or more characters equal to the contents of capture group 1".
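To make the enumerator behaviour concrete, here is a small sketch. It shows that gsub without a block returns an Enumerator, that the receiver is left untouched, and that the enumerator can be chained like any other. The method name run_lengths is a hypothetical helper, not something from the answer.

```ruby
# Hypothetical helper: pair each run's character with the run's length.
def run_lengths(str)
  # Each element yielded by the enumerator is the full matched run.
  str.gsub(/(.)\1*/).map { |run| [run[0], run.size] }
end

s = "aaabbc"
enum = s.gsub(/(.)\1*/)

p enum.class      #=> Enumerator
p enum.to_a       #=> ["aaa", "bb", "c"]
p s               #=> "aaabbc" (the original string is not modified)
p run_lengths(s)  #=> [["a", 3], ["b", 2], ["c", 1]]
```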
Upvotes: 7
Reputation: 21130
This answer does not use split, but offers another alternative. You could make use of Enumerable#chunk_while:
"aaabbc".each_char.chunk_while(&:==).map(&:join)
#=> ["aaa", "bb", "c"]
This first pulls the string apart into its individual characters, then compares each pair of consecutive elements using ==, keeping them in the same chunk for as long as the comparison is true. This produces a sequence of character arrays. Finally, you convert each character array back to a string using join.
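The same pipeline can be sketched with each intermediate step spelled out. The explicit block { |a, b| a == b } is equivalent to &:== here, and the method name runs_via_chunk_while is hypothetical.

```ruby
# Hypothetical helper that breaks the one-liner into named steps.
def runs_via_chunk_while(str)
  chars  = str.each_char.to_a                       # individual characters
  groups = chars.chunk_while { |a, b| a == b }.to_a # arrays of equal neighbours
  groups.map(&:join)                                # back to strings
end

p "aaabbc".each_char.to_a
#=> ["a", "a", "a", "b", "b", "c"]
p "aaabbc".each_char.chunk_while { |a, b| a == b }.to_a
#=> [["a", "a", "a"], ["b", "b"], ["c"]]
p runs_via_chunk_while("aaabbc")
#=> ["aaa", "bb", "c"]
```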
Upvotes: 6