ruby life questions

Reputation: 129

Regex strings in Ruby

Input strings:

str1 = "$13.90 Price as Shown"
str2 = "$590.50  $490.00 Price as Selected" 
str3 = "$9.90 or 5/$27.50 Price as Selected"

Output strings:

str1 = "13.90"
str2 = "490.00"
str3 = "9.90"

My code to get output:

str = str.strip.gsub(/\s\w{2}\s\d\/\W\d+.\d+/, "") # remove or 5/$27.50 from string
str = /\W\d+.\d+\s\w+/.match(str).to_s.gsub("$", "").gsub(" Price", "")

This code works for all three types of string, but how can I improve it? Are there better solutions? Also, can anyone link to a good regex guide or book?

Upvotes: 0

Views: 96

Answers (5)

Cary Swoveland

Reputation: 110645

Assuming you simply want the smallest dollar value in each line:

r = /
    \$    # match a dollar sign
    \d+   # match one or more digits
    \.    # match a decimal point
    \d{2} # match two digits
    /x    # extended mode

[str1, str2, str3].map { |s| s.scan(r).min_by { |price| price[1..-1].to_f } }
  #=> ["$13.90", "$490.00", "$9.90"]

Actually, you don't have to use a regex. You could do it like this:

def smallest(str)
  val = str.each_char.with_index(1).
        select { |c,_| c == ?$ }.
        map { |_,i| str[i..-1].to_f }.
        min
  "$%.2f" % val
end

smallest(str1) #=> "$13.90" 
smallest(str2) #=> "$490.00" 
smallest(str3) #=> "$9.90" 

Upvotes: 0

pguardiario

Reputation: 54984

A better regex is probably: /\B\$(\d+\.\d{2})\b/

str = "$590.50  $490.00 Price as Selected"
str.scan(/\B\$(\d+\.\d{2})\b/).flatten.min_by(&:to_f)
#=> "490.00"
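
As a quick check, the same expression can be run over all three inputs from the question. This is only a sketch: the names price_re and inputs are made up for illustration, and it assumes the lowest dollar amount on each line is always the one wanted.

```ruby
# price_re and inputs are illustrative names, not part of the original answer.
price_re = /\B\$(\d+\.\d{2})\b/

inputs = [
  "$13.90 Price as Shown",
  "$590.50  $490.00 Price as Selected",
  "$9.90 or 5/$27.50 Price as Selected"
]

# scan returns one single-element array per match, hence flatten;
# min_by(&:to_f) compares numerically but keeps the original string.
lowest = inputs.map { |s| s.scan(price_re).flatten.min_by(&:to_f) }
p lowest
#=> ["13.90", "490.00", "9.90"]
```

For the third input the scan also picks up 27.50, because the / before the $ does not stop \B from matching, so the result relies on the single-item price being the smaller of the two.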

Upvotes: 0

Pedro Lobito

Reputation: 98861

My code works fine for all 3 types of strings. Just wondering how can I improve that code

str = str.gsub(/ or \d\/[\$\d.]+/i, '')  # drop the " or 5/$27.50" multi-buy offer
str = /(\$[\d.]+) P/.match(str)          # MatchData; str[1] is the price, e.g. "$9.90"

Ruby Live Demo

http://ideone.com/18XMjr

Upvotes: 0

Jordan Running

Reputation: 106017

Supposing input can be relied upon to look like one of your three examples, how about this?

expr = /\$(\d+\.\d\d)\s+(?:or\s+\d+\/\$\d+\.\d\d\s+)?Price/

str = "$9.90 or 5/$27.50 Price as Selected"
str[expr, 1] # => "9.90"

Here it is on Rubular: http://rubular.com/r/CakoUt5Lo3

Explained:

expr = %r{
  \$          # literal dollar sign
  (\d+\.\d\d) # capture a price with two decimal places (assume no thousands separator)
  \s+         # whitespace
  (?:         # non-capturing group
    or\s+       # literal "or" followed by whitespace
    \d+\/       # one or more digits followed by literal "/"
    \$\d+\.\d\d # dollar sign and price
    \s+         # whitespace
  )?          # preceding group is optional
  Price       # the literal word "Price"
}x

You might use it like this:

MATCH_PRICE_EXPR = /\$(\d+\.\d\d)\s+(?:or\s+\d+\/\$\d+\.\d\d\s+)?Price/

def match_price(input)
  return unless input =~ MATCH_PRICE_EXPR
  $1.to_f
end

puts match_price("$13.90 Price as Shown")
# => 13.9

puts match_price("$590.50  $490.00 Price as Selected")
# => 490.0

puts match_price("$9.90 or 5/$27.50 Price as Selected")
# => 9.9

Upvotes: 1

Wiktor Stribiżew

Reputation: 626689

The regex I first suggested simply combines your two regexps:

(?<=(?<!\/)\$)\d+.\d+(?=\s\w+)

Since it is next to impossible to compare numbers with a regex, I suggest:

  1. Extracting all float numbers
  2. Parsing them as float values
  3. Taking the minimum

Here is a working snippet:

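def getLowestNumberFromString(input)
    # Collect every amount that follows a "$" which is not preceded by "/"
    arr = input.scan(/(?<=(?<!\/)\$)\d+(?:\.\d+)?/)
    # Compare as floats (plain arr.min would compare the strings
    # lexicographically) while returning the original string
    return arr.min_by { |value| value.to_f }
end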

puts getLowestNumberFromString("$13.90 Price as Shown")
puts getLowestNumberFromString("$590.50  $490.00 Price as Selected")
puts getLowestNumberFromString("$9.90 or 5/$27.50 Price as Selected")

The regex breakdown:

  • (?<=(?<!\/)\$) - assert that there is a $ symbol, not preceded by /, right before...
  • \d+ - 1 or more digits
  • (?:\.\d+)? - optionally followed by a . and 1 or more digits

Note that if you only need to match floats with a decimal part, remove the ? and the non-capturing group from the last subpattern: /(?<=(?<!\/)\$)\d+\.\d+/. The looser /(?<=(?<!\/)\$)\d*\.?\d+/ also accepts values without an integer part, such as .50, but it matches whole numbers too.
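
To see what the optional decimal part changes in practice, here is a small sketch. The sample string is made up, and the two patterns are capture-group variants of the ones above (a plain negative lookbehind plus a group instead of the nested lookbehind), so treat them as an assumption rather than the exact expressions from this answer.

```ruby
# sample is a made-up input containing a whole-dollar amount.
sample = "$5 or $7.25 Price as Selected"

# Capture-group variants of the patterns above (assumption: a plain
# negative lookbehind and a group, instead of the nested lookbehind).
optional_decimals = /(?<!\/)\$(\d+(?:\.\d+)?)/
required_decimals = /(?<!\/)\$(\d+\.\d+)/

with_optional = sample.scan(optional_decimals).flatten
with_required = sample.scan(required_decimals).flatten

p with_optional #=> ["5", "7.25"]
p with_required #=> ["7.25"]
```

With the decimal part required, the whole-dollar amount is skipped entirely, which matters if the lowest price on a line could be written without cents.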

Upvotes: 2
