How to compare numerical values in a string and display one of them?

Question

I have a data dump; the following is one row of it:

{,lat:26.3832456,distance:678.4075116373302,lon:120.4731951,address:tourism:viewpoint,},{,lat:26.3830149,distance:622.2862561842148,lon:120.473753,address:name:xe7,xbe,x85,xe6,xbc,xa2,xe5,x9d,xaa,tourism:viewpoint,},{,lat:26.3833609,distance:363.7364243757184,lon:120.4763708,address:name:xe5,x9c,x8b,xe4,xb9,x8b,xe5,x8c,x97,xe7,x96,x86,tourism:viewpoint,},{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:name:xe5,x90,x8e,xe6,xbe,xb3,natural:bay,},{,lat:26.3788243,distance:470.02293394005875,lon:120.480733,address:name:xe5,x90,x8e,xe6,xbe,xb3,xe5,xb1,xb1,source:GNS,natural:peak,},{,lat:26.3750042,distance:893.4290785528082,lon:120.4808826,address:name:xe8,x93,xae,xe8,x8a,xb1,xe5,x9c,x92,source:GNS,natural:peak,},{,lat:26.3763331,distance:742.92090763674,lon:120.4795115,address:name:xe8,xa5,xbf,xe5,xbc,x95,xe5,xb3,xb6,place:hamlet,source:GNS,},{,lat:26.378645,distance:623.327734488774,lon:120.4839399,address:source:PGS,natural:coastline,},{,lat:26.3801244,distance:418.6308872217763,lon:120.4772875,address:highway:residential,},{,lat:26.3791422,distance:434.6736862343828,lon:120.4792953,address:highway:residential,},{,lat:26.3779802,distance:739.2129423740619,lon:120.4751349,address:highway:unclassified,},{,lat:26.3770924,distance:675.0424314750977,lon:120.4815607,address:highway:residential,},{,lat:26.3760869,distance:798.0261247167285,lon:120.4821517,address:highway:path,},{,lat:26.3766434,distance:737.1372670528466,lon:120.4821003,address:highway:path,},{,lat:26.3813278,distance:384.84440601318613,lon:120.4766175,address:highway:path,},{,lat:26.3755092,distance:833.3985359252805,lon:120.4802778,address:highway:road,},{,lat:26.3785345,distance:496.6253230490143,lon:120.4799081,address:highway:road,}

The part within each pair of braces (i.e., "{...}") represents information about one entity. I need to compare the distance fields across all the brace-delimited records and then display the full record with the smallest distance. For the row above, I want to output the following:

{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:name:xe5,x90,x8e,xe6,xbe,xb3,natural:bay,}

as this is the record with the smallest value in the distance field.

How can I do this? I have written the following code just to extract all the distances so that I can compare them, but even that does not work:

require 'rubygems'
require 'mechanize'
require 'csv'    
CSV.open('Output.csv', "wb") do |csv|
    CSV.foreach('Original.csv', :headers=>true) do |row|
        vector = row.split(",")    
        dist = vector.match("^.*\/distance:\/(.*)\/")    
        csv << dist
    end
end

My idea was to extract all the distances, compare them, find the smallest, go back to the original string to locate the braces containing that particular distance, and then output the content of those braces. But this seems rather convoluted. Is there a more elegant way to output the record with the smallest distance? Thanks.

eugen · Accepted Answer

Not very elegant, but it seems to work:

s.scan(/\{[^{}]*\}/).min_by { |r| r =~ /distance:(.*),/; $1.to_f }

where s is your initial data dump as a string.

scan splits the initial data into an array of records: each match is an opening brace, any run of non-brace characters, and a closing brace. min_by then walks that array and returns the record for which the block yields the smallest value. Here the block runs a regex match against the record and converts the captured text ($1) to a float. The capture (.*) is greedy and takes in more than just the number, but to_f only parses the leading numeric part of a string, so the result is still the distance.
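The same scan and min_by approach can be sketched with a tighter capture, so the block does not depend on to_f ignoring trailing text. This is only an illustration, not the original answer's code: the method name closest_record and the Float::INFINITY fallback for records without a distance field are assumptions added here, and the sample string is three records copied from the row in the question.

```ruby
# Hypothetical helper (name is an assumption): returns the brace-delimited
# record with the smallest distance, or nil if the dump has no records.
def closest_record(dump)
  # Each match is "{", any run of non-brace characters, then "}".
  records = dump.scan(/\{[^{}]*\}/)

  records.min_by do |record|
    # Capture only digits and dots after "distance:".
    value = record[/distance:([\d.]+)/, 1]
    # Assumption: a record without a distance field should never win.
    value ? value.to_f : Float::INFINITY
  end
end

# Three records taken verbatim from the row in the question.
sample = "{,lat:26.3832456,distance:678.4075116373302,lon:120.4731951,address:tourism:viewpoint,}," \
         "{,lat:26.3823648,distance:223.60523114628876,lon:120.4821298,address:name:xe5,x90,x8e,xe6,xbe,xb3,natural:bay,}," \
         "{,lat:26.3801244,distance:418.6308872217763,lon:120.4772875,address:highway:residential,}"

puts closest_record(sample)
```

For the sample above this prints the natural:bay record, the one with distance 223.60523114628876. Since min_by returns the whole record rather than the distance, there is no need to go back to the original string to locate it.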
