Elliot Cohen

Reputation: 41

Help with Regex statement in Ruby

I have a string called 'raw'. I am trying to parse it in Ruby in the following way:

raw = "HbA1C ranging 8.0—10.0%"
raw.scan /\d*\.?\d+[ ]*(-+|\342\200\224)[ ]*\d*\.?\d+/

The output from the above is []. I think it should be: ["8.0—10.0"].

Does anyone have any insight into what is wrong with the above regex statement?

Note: \342\200\224 is the UTF-8 byte sequence, written as octal escapes, for the em dash (U+2014).
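
A quick way to double-check that those octal escapes really are the UTF-8 bytes of U+2014 is to print them. This is only a small sketch; it assumes Ruby 1.9 or later (for the \u escape and String#bytes), and the variable names are just for illustration:

```ruby
# Assumption: Ruby 1.9+ with a UTF-8 string; names are illustrative only.
em_dash = "\u2014"

# Convert each byte of the UTF-8 encoding to its octal representation.
octal_bytes = em_dash.bytes.map { |b| b.to_s(8) }

p octal_bytes  # => ["342", "200", "224"]
```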

The piece that is not working is: (-+|\342\200\224)

I think it should be equivalent to saying, match on 1 or more - OR match on the string \342\200\224.

Any help would be greatly appreciated!

Upvotes: 0

Views: 384

Answers (2)

Caius

Reputation: 734

The original regex works for me (ruby 1.8.7); it just needs the group to be non-capturing so that scan returns the entire match. Alternatively, switch from String#scan to String#[] or String#match and leave the regex unchanged.

raw = "HbA1C ranging 8.0—10.0%"
raw.scan /\d*\.?\d+[ ]*(?:-+|\342\200\224)[ ]*\d*\.?\d+/
# => ["8.0—10.0"]
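
To see the difference the group makes, here is a rough sketch comparing the three calls on the unedited (capturing) pattern. It assumes Ruby 1.9 or later and writes the em dash as \u2014 rather than the octal bytes used above; that substitution is mine, not part of the original regex:

```ruby
# Assumption: Ruby 1.9+; \u2014 stands in for the octal bytes \342\200\224.
raw = "HbA1C ranging 8.0\u201410.0%"
pattern = /\d*\.?\d+[ ]*(-+|\u2014)[ ]*\d*\.?\d+/

# With a capture group present, scan returns the captured groups,
# not the whole match: one inner array holding only the dash.
captures_only = raw.scan(pattern)

# String#[] with a regex returns the whole matched substring.
via_index = raw[pattern]

# String#match returns a MatchData; index 0 is the whole match.
via_match = raw.match(pattern)[0]

p captures_only
p via_index
p via_match
```

So with the capturing group left in place, scan hands back only the dash, while String#[] and String#match both give the full range.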

For testing/building regular expressions in Ruby there's a fantastic tool over at http://rubular.com that makes it a lot easier. http://rubular.com/r/b1318BBimb is the edited regex with a few test cases to make sure it works against them.

Upvotes: 1

fl00r

Reputation: 83680

raw = "HbA1C ranging 8.0—10.0%"
raw.scan(/\d+\.\d+.+\d+\.\d+/)
#=> ["8.0\342\200\22410.0"]

Upvotes: 0
