Dave

Reputation: 19220

How do I split on multiple white space or tabs?

In Ruby, how do you split a string on two or more whitespace characters or on a tab character? I tried this:

2.4.0 :003 > a = "b\tc\td"
 => "b\tc\td" 
2.4.0 :005 > a.strip.split(/([[:space:]][[:space:]]+|\t)/)
 => ["b", "\t", "c", "\t", "d"]

but the tabs themselves are returned as tokens, which is not what I want. The above should return

["b", "c", "d"]

Upvotes: 1

Views: 1990

Answers (3)

chitresh

Reputation: 346

There are some easier approaches than the accepted solution:

a.strip.split("\s")

or

a.split("\s")

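In a double-quoted string, "\s" is a single space, and split treats a single-space pattern specially: it splits on any run of whitespace (tabs included) and ignores leading whitespace. Note that single-quoted '\s' is a literal backslash followed by "s" and will not work, and that this approach also splits on single spaces.

For the case above, you can simply use: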

a = "b\tc\td" 
a.split("\t")    #=> ["b", "c", "d"]

or, for a combination of multiple spaces and tabs:

a.gsub("\t", " ").split("\s")     #=> ["b", "c", "d"]
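A minimal sketch of how the single-space pattern behaves, which may help when deciding whether it fits. The sample strings and variable names here are made up for illustration and are not from the question:

```ruby
# "\s" in a double-quoted string is an escape for a single space.
space_escape = "\s"
puts space_escape == " "   # prints true

# Hypothetical sample line mixing leading spaces, tabs, and runs of spaces.
line = "  b\tc   d \t e  "

# A single-space pattern makes split break on any run of whitespace
# and skip leading whitespace.
tokens = line.split(" ")
p tokens                   # ["b", "c", "d", "e"]

# Single-quoted '\s' is a literal backslash plus "s", so nothing matches.
literal = "b\tc\td".split('\s')
p literal                  # ["b\tc\td"]

# Caveat: a single space also separates tokens, which differs from the
# "two or more whitespace characters or a tab" rule in the question.
single_space = "b c\td".split(" ")
p single_space             # ["b", "c", "d"]
```

If single spaces must stay inside a token, a regular expression is probably the safer choice.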

Upvotes: 0

Wiktor Stribiżew

Reputation: 627103

This happens because the group you used is a capturing one. See the String#split reference:

If pattern contains groups, the respective matches will be returned in the array as well.

Use a non-capturing group (used only for grouping patterns) to avoid adding matched strings into the resulting array:

a.strip.split(/(?:[[:space:]][[:space:]]+|\t)/)
                ^^
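To make the difference concrete, here is a small sketch comparing the two patterns side by side. The `mixed` string is a made-up input, added only to show that single spaces are left alone:

```ruby
a = "b\tc\td"

# Capturing group: each matched separator is returned as a token too.
with_capture = a.strip.split(/([[:space:]][[:space:]]+|\t)/)
p with_capture      # ["b", "\t", "c", "\t", "d"]

# Non-capturing group: the separators are dropped.
without_capture = a.strip.split(/(?:[[:space:]][[:space:]]+|\t)/)
p without_capture   # ["b", "c", "d"]

# Hypothetical input: a single space does not split, but a run of two
# spaces or a tab does.
mixed = "one two  three\tfour"
mixed_tokens = mixed.split(/(?:[[:space:]][[:space:]]+|\t)/)
p mixed_tokens      # ["one two", "three", "four"]
```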

Upvotes: 2

coreyward

Reputation: 80090

In this instance you can use a character class that includes both spaces and tabs in your regular expression:

"b\tc\td".split(/[ \t]+/)

If you want to split on any whitespace, you can use \s+ instead (the brackets in [\s]+ are redundant). In Ruby, \s matches space, tab, newline, carriage return, form feed, and vertical tab.
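A quick sketch of how the two character classes differ, using a made-up string that also contains a newline:

```ruby
# Hypothetical input: tab, a run of spaces and a tab, then a newline.
text = "b\tc  \t d\ne"

# [ \t]+ splits on runs of spaces and tabs only; the newline stays in place.
spaces_and_tabs = text.split(/[ \t]+/)
p spaces_and_tabs   # ["b", "c", "d\ne"]

# \s+ also treats the newline as a separator.
any_whitespace = text.split(/\s+/)
p any_whitespace    # ["b", "c", "d", "e"]
```

Both patterns split on a single space as well, so they are broader than the "two or more whitespace characters or a tab" rule stated in the question.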

Upvotes: 0
