randomuser15995183

Reputation: 251

Different results when using the split function in Ruby

Given the following inputs:

line1 = "Hey | Hello | Good | Morning"
line2 = "Hey , Hello , Good , Morning"
file1=length1=name1=title1=nil

Using ',' to split the string as follows:

file1, length1, name1, title1 = line2.split(/,\s*/)

I get the following output:

puts file1,length1,name1,title1

>Hey
>Hello
>Good
>Morning

However, using '|' to split the string I receive a different output:

file1, length1, name1, title1 = line1.split(/|\s*/)
puts file1,length1,name1,title1

>H
>e
>y

The two strings are identical except for the separator (a comma in the first case and a pipe in the second). The call to split is also the same, except of course for the delimiting character. What causes this difference?

Upvotes: 4

Views: 114

Answers (1)

nhahtdh

Reputation: 56809

The problem is that | is the alternation (OR) operator in a regex. To match a literal pipe character, you need to escape it as \|. So the correct regex is /\|\s*/
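
As a rough sketch of the fix, using line1 from the question (the names fields, trimmed and via_string are only illustrative): the escaped pipe splits on the literal character. Note that /\|\s*/ only consumes whitespace after the pipe, so the space before each pipe stays attached to the token. Allowing whitespace on both sides, or splitting on the plain string "|" and stripping each piece, are two possible ways around that.

```ruby
line1 = "Hey | Hello | Good | Morning"

# Escaped pipe: a literal "|" followed by optional whitespace.
# The space *before* each pipe is not part of the delimiter, so it is kept.
fields = line1.split(/\|\s*/)
p fields   # => ["Hey ", "Hello ", "Good ", "Morning"]

# Allow whitespace on both sides of the pipe to get clean tokens.
trimmed = line1.split(/\s*\|\s*/)
p trimmed  # => ["Hey", "Hello", "Good", "Morning"]

# A String argument to split is taken literally, so no escaping is needed.
via_string = line1.split("|").map(&:strip)

file1, length1, name1, title1 = trimmed
puts file1, length1, name1, title1
```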


As written, the regex /|\s*/ means "the empty string, or a run of whitespace characters". Because the empty string comes first in the alternation, the regex engine matches it at every position, so split breaks the string up at every character (you can imagine an empty string sitting between each pair of adjacent characters). If you swap the alternatives to /\s*|/, whitespace is preferred over the empty string wherever it occurs, so no whitespace appears in the list of tokens after splitting. The string is still split into single characters, though, because \s* can itself match the empty string.
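
To see the two alternations side by side, here is a small sketch using line1 from the question (empty_first and space_first are made-up names for illustration):

```ruby
line1 = "Hey | Hello | Good | Morning"

# Unescaped pipe: "empty string OR whitespace". The empty alternative
# matches at every position, so every character becomes its own token,
# spaces and pipes included.
empty_first = line1.split(/|\s*/)
p empty_first.first(4)  # => ["H", "e", "y", " "]

# Alternatives swapped: runs of whitespace are consumed as delimiters,
# but the string is still cut between every remaining character.
space_first = line1.split(/\s*|/)
p space_first.first(4)  # => ["H", "e", "y", "|"]
```

This matches the output in the question: the first three variables receive "H", "e" and "y", and the fourth receives a single space, which is easy to miss when printed.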

Upvotes: 7
