user5786934


Split by multiple delimiters

I'm receiving a string that contains two numbers in a handful of different formats:

"344, 345", "334,433", "345x532" and "432 345"

I need to split them into two separate numbers in an array using split, and then convert them using Integer(num).

What I've tried so far:

nums.split(/[\s+,x]/) # split on one or more spaces, a comma or x

However, it doesn't seem to match multiple spaces when testing. Also, it doesn't allow a space in the comma version shown above ("344, 345").

How can I match multiple delimiters?

Upvotes: 3

Views: 985

Answers (4)

Michael Gaskill

Reputation: 8042

Your original regex would work with a minor adjustment to move the '+' symbol outside the character class:

"344 ,x  345".split(/[\s,x]+/).map(&:to_i) #==> [344,345]

If the examples are actually the only formats that you'll encounter, this will work well. However, if you have to be more flexible and accommodate unknown separators between the numbers, you're better off with the answer given by Wiktor:

"344 ,x  345".split(/\D+/).map(&:to_i) #==> [344,345]

Both cases return an array of Integers for the inputs given; however, the second example is both more robust and easier to understand at a glance.
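
Since the question asks for conversion with Integer rather than to_i, here is a minimal sketch of the first pattern run over the four sample inputs. The helper name parse_pair is made up for illustration, and passing base 10 to Integer is my own assumption so that a leading zero is not read as octal:

```ruby
# The four sample inputs from the question.
samples = ["344, 345", "334,433", "345x532", "432 345"]

# Hypothetical helper: split on runs of whitespace, commas, or "x",
# then convert strictly. Integer raises ArgumentError on a malformed
# piece, whereas to_i would quietly return 0.
def parse_pair(str)
  str.split(/[\s,x]+/).map { |part| Integer(part, 10) }
end

samples.each { |s| p parse_pair(s) }
# prints:
# [344, 345]
# [334, 433]
# [345, 532]
# [432, 345]
```

The strict conversion is a trade-off: an input such as "432&345" raises instead of producing a wrong number, so the caller has to rescue ArgumentError if bad input is expected.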

Upvotes: 2

Cary Swoveland

Reputation: 110755

R1 = Regexp.union([", ", ",", "x", " "])
  #=> /,\ |,|x|\ /
R2 = /\A\d+#{R1}\d+\z/
  #=> /\A\d+(?-mix:,\ |,|x|\ )\d+\z/

def split_it(s)
  return nil unless s =~ R2
  s.split(R1).map(&:to_i)
end

split_it("344, 345") #=> [344, 345] 
split_it("334,433")  #=> [334, 433] 
split_it("345x532")  #=> [345, 532] 
split_it("432 345")  #=> [432, 345] 
split_it("432&345")  #=> nil
split_it("x32 345")  #=> nil

Upvotes: 2

Sergio Tulentsev

Reputation: 230551

"it doesn't seem to match multiple spaces when testing"

Yeah, a character class (square brackets) doesn't work like this. A quantifier applies to the class as a whole, not to the characters inside it; inside the brackets, + is just a literal plus sign. You could use the | operator instead. Something like this:

.split(%r[\s+|,\s*|x])
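
To make the difference concrete, here is a small sketch comparing the pattern from the question with the alternation above. The variable names and the double-space input are only illustrative:

```ruby
# Pattern from the question: each delimiter match is exactly one character,
# so ", " and a double space are each treated as two separate delimiters.
original = /[\s+,x]/
p "344, 345".split(original)     # prints ["344", "", "345"]
p "432  345".split(original)     # prints ["432", "", "345"]

# Alternation: every branch describes one complete delimiter.
alternation = %r[\s+|,\s*|x]
p "344, 345".split(alternation)  # prints ["344", "345"]
p "432  345".split(alternation)  # prints ["432", "345"]
```

The empty strings in the first two results are what make Integer fail later on.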

Upvotes: 0

Wiktor Stribiżew

Reputation: 627537

You are using a character class in your pattern, and a character class matches only one character. [\s+,x] matches a single whitespace character, a +, a comma, or an x. You probably meant an alternation such as (?:\s+|,\s*|x).

However, perhaps, a mere \D+ (1 or more non-digit characters) should suffice:

"345, 456".split(/\D+/).map(&:to_i)

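One caveat worth knowing with \D+: if the string starts with a non-digit, split returns an empty first field, and to_i turns that into 0. A minimal sketch follows; the scan variant is an alternative technique I'm adding for comparison, not part of the answer above:

```ruby
# \D+ treats every run of non-digit characters as a single delimiter.
p "345, 456".split(/\D+/).map(&:to_i)  # prints [345, 456]

# A leading non-digit produces an empty first field, and "".to_i is 0.
p "x32 345".split(/\D+/).map(&:to_i)   # prints [0, 32, 345]

# Alternative: collect the digit runs directly instead of splitting.
p "x32 345".scan(/\d+/).map(&:to_i)    # prints [32, 345]
```

Neither variant rejects malformed input, so if "x32 345" should be an error rather than two numbers, validate the whole string first.
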
Upvotes: 2
