Using Regex and ruby regular expressions to find values

Question

I'm currently trying to sort values from a file. I'm stuck on finding the first attribute and am not sure why. I'm new to regex and Ruby, so I'm not sure how to go about the problem. I'm trying to find the values of a, b, c, d and e, which are all positive numbers.

Here's what the line will look like

length=<a> begin=(<b>,<c>) end=(<d>,<e>)

Here's what I'm using to find the values

current_line = file.gets
if current_line == nil then return end
while current_line = file.gets do
  if line =~ /length=<(\d+)> begin=((\d+),(\d+)) end=((\d+),(\d+))/
    length, begin_x, begin_y, end_x, end_y = $1, $2, $3, $4, $5
    puts("length:" + length.to_s + " begin:" + begin_x.to_s + "," + begin_y.to_s + " end:" + end_x.to_s + "," + end_y.to_s)
  end
end

For some reason it never prints anything, so I'm assuming it never finds a match.

Sample input length=4 begin=(0,0) end=(3,0)

A line with 0-4 decimals after 2 integers, separated by commas. So it could be any of these:

2 4 1.3434324,3.543243,4.525324 1 2 18 3.3213,9.3233,1.12231,2.5435 7 9 2.2,1.899990 0 3 2.323

Cary Swoveland · Accepted Answer

Here is your regex:

r = /length=<(\d+)> begin=((\d+),(\d+)) end=((\d+),(\d+))/
str.scan(r)
  #=> []

First, we need to escape the literal parentheses:

r = /length=<(\d+)> begin=\((\d+),(\d+)\) end=\((\d+),(\d+)\)/
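To illustrate why the escaping matters, here is a minimal sketch: an unescaped ( ) pair is a capture group and matches no character in the text, whereas \( and \) match the parentheses themselves. The sample string and variable names are made up for this illustration.

```ruby
# Hypothetical sample string, used only for illustration.
sample = "begin=(21,47)"

# Unescaped: the outer parentheses form a capture group, so the pattern
# expects "begin=" to be followed directly by digits. Nothing matches.
unescaped = /begin=((\d+),(\d+))/
unescaped_result = sample.scan(unescaped)

# Escaped: \( and \) match the literal parentheses in the text.
escaped = /begin=\((\d+),(\d+)\)/
escaped_result = sample.scan(escaped)

p unescaped_result  # prints []
p escaped_result    # prints [["21", "47"]]
```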

Next, add the missing < and > after "begin" and "end".

r = /length=<(\d+)> begin=\(<(\d+)>,<(\d+)>\) end=\(<(\d+)>,<(\d+)>\)/

Now let's try it:

str = "length=<4779> begin=(<21>,<47>) end=(<356>,<17>)"

str.scan(r)
  #=> [["4779", "21", "47", "356", "17"]]

Success!
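For completeness, here is one way the reading loop from the question might look with a pattern whose literal parentheses are escaped. Two things in the original loop appear to get in the way besides the regex: the initial file.gets reads the first line and the while loop then moves past it, and the test is made against line while the variable being read is current_line. This is only a sketch; StringIO stands in for the opened file, and the input lines are assumptions for the sake of the example.

```ruby
require "stringio"

# Pattern with the literal parentheses escaped.
PATTERN = /length=<(\d+)> begin=\(<(\d+)>,<(\d+)>\) end=\(<(\d+)>,<(\d+)>\)/

# StringIO stands in for the opened file (an assumption for this sketch).
file = StringIO.new(<<~INPUT)
  length=<4779> begin=(<21>,<47>) end=(<356>,<17>)
  not a matching line
  length=<4> begin=(<0>,<0>) end=(<3>,<0>)
INPUT

results = []
# each_line visits every line, including the first one.
file.each_line do |current_line|
  match = PATTERN.match(current_line)
  next unless match  # skip lines that do not fit the format

  length, begin_x, begin_y, end_x, end_y = match.captures
  results << "length:#{length} begin:#{begin_x},#{begin_y} end:#{end_x},#{end_y}"
end

puts results
```

Using match.captures instead of $1 through $5 keeps the values tied to the match object rather than to global state, which is a matter of taste.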

Lastly (though probably not necessary), we might replace each single space with \s+, which permits one or more whitespace characters:

r = /length=<(\d+)>\s+begin=\(<(\d+)>,<(\d+)>\)\s+end=\(<(\d+)>,<(\d+)>\)/
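A quick check of the more tolerant pattern, as a sketch: the input below is a made-up line with irregular spacing (three spaces, then a tab) that a pattern with single literal spaces would not match.

```ruby
# Pattern with \s+ wherever a single space was expected.
r = /length=<(\d+)>\s+begin=\(<(\d+)>,<(\d+)>\)\s+end=\(<(\d+)>,<(\d+)>\)/

# Hypothetical input with irregular spacing (several spaces, then a tab).
spaced = "length=<4779>   begin=(<21>,<47>)\tend=(<356>,<17>)"

spaced_result = spaced.scan(r)
p spaced_result  # prints [["4779", "21", "47", "356", "17"]]
```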

Addendum

The OP has asked how this would be modified if some of the numeric values were floats. I do not understand precisely what has been requested, but the following could be modified as required. I've assumed all the numbers are non-negative. I've also illustrated one way to "build" a regex, using Regexp#new.

  s1 = '<(\d+(?:\.\d+)?)>' # note single quotes
    #=> "<(\\d+(?:\\.\\d+)?)>" 
  s2 = "=\\(#{s1},#{s1}\\)"
    #=> "=\\(<(\\d+(?:\\.\\d+)?)>,<(\\d+(?:\\.\\d+)?)>\\)" 
  r = Regexp.new("length=#{s1} begin#{s2} end#{s2}")
    #=> /length=<(\d+(?:\.\d+)?)> begin=\(<(\d+(?:\.\d+)?)>,<(\d+(?:\.\d+)?)>\) end=\(<(\d+(?:\.\d+)?)>,<(\d+(?:\.\d+)?)>\)/ 

  str = "length=<47.79> begin=(<21>,<4.7>) end=(<0.356>,<17.999>)" 

  str.scan(r)
    #=> [["47.79", "21", "4.7", "0.356", "17.999"]]
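The captures come back as strings. If numeric values are needed, one possible follow-up is to convert them after matching. The sketch below rebuilds the same kind of pattern and converts with Float(); the variable names num and values are illustrative only.

```ruby
# Number sub-pattern: digits with an optional fractional part.
num = '<(\d+(?:\.\d+)?)>'  # single quotes keep the backslashes as written
r = Regexp.new("length=#{num} begin=\\(#{num},#{num}\\) end=\\(#{num},#{num}\\)")

str = "length=<47.79> begin=(<21>,<4.7>) end=(<0.356>,<17.999>)"

match = r.match(str)
# Float() raises on malformed input, unlike to_f, which silently
# returns 0.0 for non-numeric text.
values = match ? match.captures.map { |s| Float(s) } : []

p values  # prints [47.79, 21.0, 4.7, 0.356, 17.999]
```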
