Nadeem Yasin

Reputation: 4534

Retaining the pattern characters while splitting via Regex, Ruby

I have the following string

str="HelloWorld How areYou I AmFine"

I want to split this string into the following array:

["Hello","World How are","You I Am", "Fine"]

I have been using the following regex. It splits at the right places, but it also drops the characters that matched the pattern, and I want to keep them. What I get is:

str.split(/[a-z][A-Z]/)
 => ["Hell", "orld How ar", "ou I A", "ine"] 

It omits the matched characters.

Can anyone help me retain these characters in the resulting array?

Upvotes: 10

Views: 1953

Answers (3)

dbenhur

Reputation: 20398

Three answers so far, each with a limitation: one is Rails-only and breaks if the original string contains an underscore, another is Ruby 1.9-only, and the third always has a potential error with its special character. I really liked the split-on-zero-width-assertion answer from @Alex Kliuchnikau, but the OP needs Ruby 1.8, which doesn't support lookbehind. Here is an answer that uses only zero-width lookahead and works fine in both 1.8 and 1.9, using String#scan instead of #split.

str.scan /.*?[a-z](?=[A-Z]|$)/
=> ["Hello", "World How are", "You I Am", "Fine"]
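One thing that may be worth checking before relying on the scan approach: the pattern only returns pieces that end in a lowercase letter, so a tail that ends in anything else appears to be dropped silently. A small sketch of that behaviour; the second input string and the variable names are made up for illustration:

```ruby
pattern = /.*?[a-z](?=[A-Z]|$)/

# Every piece ends in a lowercase letter, so nothing is lost.
complete = "HelloWorld How areYou I AmFine".scan(pattern)

# After "World" there is no lowercase letter that is followed by an
# uppercase letter or by end of line, so "World 42" is not returned.
truncated = "HelloWorld 42".scan(pattern)

p complete
p truncated
```

If the input can end in digits, punctuation or an uppercase letter, the pattern would probably need adjusting.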

Upvotes: 5

mfq

Reputation: 1377

I think this will do the job for you

# String#underscore comes from ActiveSupport (Rails), not plain Ruby
str.underscore.split(/_/).each do |s|
  s.capitalize!
end

Upvotes: -1

Aliaksei Kliuchnikau

Reputation: 13739

In Ruby 1.9 you can use positive lookahead and positive lookbehind (these regex constructs are also called zero-width assertions). They test the surrounding characters without consuming them, so the match itself is empty and you won't lose your border characters:

str.split /(?<=[a-z])(?=[A-Z])/
=> ["Hello", "World How are", "You I Am", "Fine"] 
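To see why the zero-width version keeps the border characters, it may help to run it side by side with the consuming pattern from the question. A minimal sketch, assuming Ruby 1.9 or later; the variable names `consuming` and `zero_width` are only illustrative:

```ruby
str = "HelloWorld How areYou I AmFine"

# Consuming pattern: the two matched characters are the delimiter,
# so split discards them.
consuming = str.split(/[a-z][A-Z]/)

# Zero-width pattern: the match has length 0 and sits between the
# lowercase and the uppercase letter, so nothing is discarded.
zero_width = str.split(/(?<=[a-z])(?=[A-Z])/)

p consuming
p zero_width
```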

Ruby 1.8 does not support lookbehind (it does support lookahead), so the pattern above will not work there. I recommend using Ruby 1.9 if possible.

If you are forced to use Ruby 1.8.7, this split regex won't help you, and the best solution I can think of is to build a simple state machine: iterate over each character in the original string and build the first string until you encounter the border condition, then start building the second string, and so on.
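The state machine described above could look roughly like this. It is only a sketch: the method name `split_on_case_border` is made up, the border condition is assumed to be "lowercase ASCII letter followed by uppercase ASCII letter" as in the question, and it has not been tuned for speed.

```ruby
# Splits str wherever a lowercase letter is immediately followed by
# an uppercase letter, keeping both letters in the output.
def split_on_case_border(str)
  pieces = []
  current = ""
  prev = nil

  str.each_char do |ch|
    # Border condition: previous char is lowercase, this one is uppercase.
    if prev && prev =~ /[a-z]/ && ch =~ /[A-Z]/
      pieces << current
      current = ""
    end
    current += ch
    prev = ch
  end

  # Flush the last piece, if any.
  pieces << current unless current.empty?
  pieces
end

p split_on_case_border("HelloWorld How areYou I AmFine")
```

It uses no lookaround at all, so the regex features missing from 1.8 are not needed.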

Upvotes: 7
