Manav

Reputation: 669

Ruby: split string in hash

I have a string

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

Expected result: I want to split this into a hash like this:

hash = {
   race_1 => [650, 215, 265, 315],
   race_2 => [165, 215, 265, 315]
}

Can someone please guide me in a direction to create the matching hash?

Upvotes: 1

Views: 245

Answers (5)

rubycademy.com

Reputation: 529

Is this the expected output?

require 'yaml'

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

races = YAML.load(str)

races.each { |k, v| races[k] = v.scan(/\d+/).map(&:to_i) }

pp races

produces

{
   "race_1" => [650, 215, 265, 315],
   "race_2" => [165, 215, 265, 315]
}

Upvotes: 0

Cary Swoveland

Reputation: 110675

The following handles any number of races, each with any number of associated distances (in str below each race has four).

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m"
str.gsub(/(\w+): ((?:\d+m, *)*\d+)/).with_object({}) do |_s,h|
  h[$1] = $2.split(',').map(&:to_i)
end
  #=> {"race_1"=>[650, 215, 265, 315],
  #    "race_2"=>[165, 215, 265, 315]}

This employs a little-used (and greatly undervalued) form of String#gsub that takes a single argument but no block, and returns an enumerator. The enumerator merely generates matches of gsub's argument and therefore has nothing to do with string replacement. This form of gsub is sometimes a convenient replacement for String#scan when scan's argument is a regular expression that contains one or more capture groups.
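To make the contrast with String#scan concrete, here is a small sketch on a shortened input (the shortened string and the variable names are mine, chosen only for illustration). With capture groups, scan hands back the captures, while the block-less gsub enumerator yields each whole match and leaves the string alone.

```ruby
# Shortened sample input, used only for this illustration.
str = "race_1: 650m, 215m\r\nrace_2: 165m, 215m"
rgx = /(\w+): ((?:\d+m, *)*\d+)/

# With capture groups, scan returns one array of captures per match.
captures = str.scan(rgx)
#=> [["race_1", "650m, 215"], ["race_2", "165m, 215"]]

# gsub with a pattern but no block and no replacement returns an
# enumerator that yields each whole match.
matches = str.gsub(rgx).to_a
#=> ["race_1: 650m, 215", "race_2: 165m, 215"]

# Nothing is replaced: the receiver is unchanged.
unchanged = (str == "race_1: 650m, 215m\r\nrace_2: 165m, 215m")
#=> true
```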

The regular expression that is gsub's argument can be expressed in free-spacing mode to make it self-documenting.

/
(          # begin capture group 1
  \w+      # match >= 1 word characters
)          # end capture group 1
:          # match a colon
[ ]        # match a space
(          # begin capture group 2
  (?:      # begin non-capture group
    \d+    # match >= 1 digits
    m,[ ]* # match "m," followed by >= 0 spaces
  )        # end non-capture group
  *        # execute preceding non-capture group >= 0 times
  \d+      # match >= 1 digits
)          # end capture group 2
/x         # invoke free-spacing regex definition mode

Note that in free-spacing mode spaces that are part of the expression must be protected. There are various ways of doing that. I have enclosed each space in a character class ([ ]).
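A quick sketch of what happens when the space is not protected (the sample string and variable names are assumptions for illustration): in free-spacing mode a bare space is dropped from the pattern, so the regex silently stops requiring it.

```ruby
# Sample input, used only for this illustration.
str = "race_1: 650m, 215m"

plain   = /(\w+): (\d+)/          # ordinary mode: the space is literal
kept    = /(\w+) : [ ] (\d+)/x    # free-spacing: the space survives inside [ ]
dropped = /(\w+) : (\d+)/x        # free-spacing: equivalent to /(\w+):(\d+)/

plain_ok   = plain.match?(str)    #=> true
kept_ok    = kept.match?(str)     #=> true
dropped_ok = dropped.match?(str)  #=> false, no digit directly follows the colon
```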


In the example above we compute the following enumerator.

enum = str.gsub(/(\w+): ((?:\d+m, *)*\d+)/)
  #=> #<Enumerator: "race_1: 650m, 215m, 265m, 315m\r\n
  #     race_2: 165m, 215m, 265m, 315m":
  #     gsub(/(\w+): ((?:\d+m, *)*\d+)/)>

The elements it will generate are as follows.

enum.next
  #=> "race_1: 650m, 215m, 265m, 315"
enum.next
  #=> "race_2: 165m, 215m, 265m, 315"
enum.next
  #=> StopIteration: iteration reached an end

Note also that

arr = "650m, 215m, 265m, 315".split(',')
  #=> ["650m", " 215m", " 265m", " 315"]

arr.map(&:to_i)
  #=> [650, 215, 265, 315]
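This works because String#to_i is lenient: it skips leading whitespace and stops at the first character that is not part of a number. A short sketch of that behaviour, with sample values of my own choosing, next to the stricter Kernel#Integer for comparison:

```ruby
# to_i ignores leading whitespace and trailing non-digits,
# and returns 0 when the string does not start with a number.
lenient = [" 215m", "315m\r", "m215"].map(&:to_i)
#=> [215, 315, 0]

# Integer() rejects trailing garbage; with exception: false
# (Ruby 2.6+) it returns nil instead of raising ArgumentError.
strict = Integer(" 215m", exception: false)
#=> nil
```

If malformed input should be detected rather than silently turned into 0, the stricter conversion may be the safer choice.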

A variant of this is to write

rgx = /\w+: (?:\d+m, *)*\d+/

str.gsub(rgx).with_object({}) do |s,h|
  key, value = s.split(':')
  h[key] = value.split(',').map(&:to_i)
end
  #=> {"race_1"=>[650, 215, 265, 315],
  #    "race_2"=>[165, 215, 265, 315]}

As the regular expression now has no capture groups, we get the same result when the line beginning with str.gsub(rgx) is replaced with

str.scan(rgx).each_with_object({}) do |s,h|
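Written out in full, the scan-based variant might look like the sketch below (the result variable is my addition; everything else is taken from the code above).

```ruby
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m"
rgx = /\w+: (?:\d+m, *)*\d+/

# With no capture groups, scan returns the whole matches as strings.
result = str.scan(rgx).each_with_object({}) do |s, h|
  key, value = s.split(':')
  h[key] = value.split(',').map(&:to_i)
end
#=> {"race_1"=>[650, 215, 265, 315],
#    "race_2"=>[165, 215, 265, 315]}
```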

Upvotes: 2

Rajagopalan

Reputation: 6064

You can write this code

Input

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

Code

Split each line at the colon, then strip the trailing m from each distance

hash = str.scan(/(race_\d+): (.*)/).each_with_object({}) do |(race, distances), result|
  result[race] = distances.split(', ').map { |d| d.sub(/m$/, '').to_i }
end
p hash

Output

{"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}

Upvotes: 2

spickermann

Reputation: 106802

When the input always follows the same pattern, then I would use String#scan with a Regexp to extract the significant values.

string = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
regexp = /(race_\d+).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m))/

string.scan(regexp)
#=> [["race_1", "650", "215", "265", "315"], ["race_2", "165", "215", "265", "315"]]

This nested array of values can then be transformed into a hash like this:

string.scan(regexp).to_h { |values| [values[0], values[1..-1]] }
#=> {"race_1"=>["650", "215", "265", "315"], "race_2"=>["165", "215", "265", "315"]}

And because you want the numbers in the array to be integers:

string.scan(regexp).to_h { |values| [values[0], values[1..-1].map(&:to_i)] }
#=> {"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}
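The regexp above assumes exactly four distances per race. If that cannot be guaranteed, one possible variation (a sketch, not part of the approach above; the sample input with differing counts and the races variable are my own) is to go line by line and scan each line for its distances:

```ruby
# Sample input with a different number of distances per race,
# made up for this illustration.
string = "race_1: 650m, 215m\r\nrace_2: 165m, 215m, 265m, 315m, 400m."

# Enumerable#to_h with a block requires Ruby 2.6 or later.
races = string.each_line.to_h do |line|
  # Split only at the first colon: name on the left, distances on the right.
  name, distances = line.split(':', 2)
  # Pick every run of digits that is directly followed by "m".
  [name, distances.scan(/\d+(?=m)/).map(&:to_i)]
end
#=> {"race_1"=>[650, 215], "race_2"=>[165, 215, 265, 315, 400]}
```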

Upvotes: 4

Lee Drum

Reputation: 329

Could you try the code below?

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
rows = str.delete('.').split("\r\n") # => ["race_1: 650m, 215m, 265m, 315m", "race_2: 165m, 215m, 265m, 315m"] 
hash_result = {}
rows.each do |row|
  key = row.split(':').first # => race_1
  value = row.split(':').last.split('m, ').map(&:to_i) # => [650, 215, 265, 315]
  hash_result[key.to_sym] = value
end
# hash_result = {:race_1=>[650, 215, 265, 315], :race_2=>[165, 215, 265, 315]}

P.S. I think you should try to work it out yourself first; that is how you improve.

Upvotes: 0
