Reputation: 669
I have a string
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
Expected result: I want to split this into a hash like this:
hash = {
race_1 => [650, 215, 265, 315],
race_2 => [165, 215, 265, 315]
}
Can someone please guide me in a direction to create the matching hash?
Upvotes: 1
Views: 245
Reputation: 529
Is this the expected output?
require 'yaml'
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
races = YAML.load(str)
races.each { |k, v| races[k] = v.scan(/\d+/).map(&:to_i) }
pp races
produces
{
"race_1" => [650, 215, 265, 315],
"race_2" => [165, 215, 265, 315]
}
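A variant of the same idea, as a minimal sketch: it assumes YAML.load reads each line as a key with a plain string value, and it uses Hash#transform_values to build a new hash instead of assigning into the hash while iterating over it. The names raw and races are only illustrative.

```ruby
require 'yaml'

str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."

# Step 1: YAML treats each line as "key: value" with a string value.
raw = YAML.load(str)

# Step 2: pull every run of digits out of each value and convert it to an Integer.
races = raw.transform_values { |v| v.scan(/\d+/).map(&:to_i) }

pp races
```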
Upvotes: 0
Reputation: 110675
The following allows any number of races, and each race may have any number of associated distances (in str below there are four).
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m"
str.gsub(/(\w+): ((?:\d+m, *)*\d+)/).with_object({}) do |_s,h|
h[$1] = $2.split(',').map(&:to_i)
end
#=> {"race_1"=>[650, 215, 265, 315],
# "race_2"=>[165, 215, 265, 315]}
This employs a little-used (and greatly undervalued) form of String#gsub that takes a single argument but no block, and returns an enumerator. The enumerator merely generates matches of gsub's argument and therefore has nothing to do with string replacement. This form of gsub is sometimes a convenient replacement for String#scan when scan's argument is a regular expression that contains one or more capture groups.
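To illustrate the difference, here is a small sketch on a shortened input (the shortened str and the names captures and matches are made up for this illustration). With capture groups, scan hands back the captures, whereas the gsub enumerator yields the whole matches.

```ruby
str = "race_1: 650m, 215m\r\nrace_2: 165m, 215m"
rgx = /(\w+): ((?:\d+m, *)*\d+)/

# scan with capture groups returns an array of captures per match
captures = str.scan(rgx)
#=> [["race_1", "650m, 215"], ["race_2", "165m, 215"]]

# gsub without a block returns an enumerator over the full matched strings
matches = str.gsub(rgx).to_a
#=> ["race_1: 650m, 215", "race_2: 165m, 215"]
```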
The regular expression that is gsub's argument can be expressed in free-spacing mode to make it self-documenting.
/
(                # begin capture group 1
  \w+            # match >= 1 word characters
)                # end capture group 1
:                # match a colon
[ ]              # match a space
(                # begin capture group 2
  (?:            # begin non-capture group
    \d+          # match >= 1 digits
    m,[ ]*       # match "m," followed by >= 0 spaces
  )              # end non-capture group
  *              # execute preceding non-capture group >= 0 times
  \d+            # match >= 1 digits
)                # end capture group 2
/x               # invoke free-spacing regex definition mode
Note that in free-spacing mode spaces that are part of the expression must be protected. There are various ways of doing that. I have enclosed each space in a character class ([ ]).
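As a quick sketch of what "protected" means, the snippet below compares an unprotected space with two protected forms: the character class used here and a backslash-escaped space. The sample line and the three pattern names are made up for this illustration.

```ruby
line = "race_1: 650m"

# In /x mode the bare space is ignored, so this behaves like /(\w+):(\d+)/
unprotected = /(\w+): (\d+)/x

# Two ways to keep the space as part of the pattern
char_class = /(\w+):[ ](\d+)/x
escaped    = /(\w+):\ (\d+)/x

unprotected.match?(line) #=> false
char_class.match?(line)  #=> true
escaped.match?(line)     #=> true
```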
For the str and regular expression given at the top of this answer, gsub returns the following enumerator.
enum = str.gsub(/(\w+): ((?:\d+m, *)*\d+)/)
#=> #<Enumerator: "race_1: 650m, 215m, 265m, 315m\r\n
# race_2: 165m, 215m, 265m, 315m":
# gsub(/(\w+): ((?:\d+m, *)*\d+)/)>
The elements it will generate are as follows.
enum.next
#=> "race_1: 650m, 215m, 265m, 315"
enum.next
#=> "race_2: 165m, 215m, 265m, 315"
enum.next
#=> StopIteration: iteration reached an end
Note also that
arr = "650m, 215m, 265m, 315".split(',')
#=> ["650m", " 215m", " 265m", " 315"]
arr.map(&:to_i)
#=> [650, 215, 265, 315]
A variant of this is to write
rgx = /\w+: (?:\d+m, *)*\d+/
str.gsub(rgx).with_object({}) do |s,h|
key, value = s.split(':')
h[key] = value.split(',').map(&:to_i)
end
#=> {"race_1"=>[650, 215, 265, 315],
# "race_2"=>[165, 215, 265, 315]}
As the regular expression now has no capture groups we get the same result when the first line is replaced with
str.scan(rgx).each_with_object({}) do |s,h|
Upvotes: 2
Reputation: 6064
You can write this code
Input
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
Code
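Split each line at the colon, then split the distances on ", " and convert each one to an integer
hash = str.scan(/(race_\d+): (.*)/).each_with_object({}) do |(race, distances), result|
  # String#to_i ignores the trailing "m" (and any "." or "\r"), so no substitution is needed
  result[race] = distances.split(', ').map(&:to_i)
end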
p hash
Output
{"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}
Upvotes: 2
Reputation: 106802
When the input always follows the same pattern, then I would use String#scan
with a Regexp to extract the significant values.
string = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
regexp = /(race_\d+).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m))/
string.scan(regexp)
#=> [["race_1", "650", "215", "265", "315"], ["race_2", "165", "215", "265", "315"]]
This nested array of values can then be transformed into a hash like this:
string.scan(regexp).to_h { |values| [values[0], values[1..-1]] }
#=> {"race_1"=>["650", "215", "265", "315"], "race_2"=>["165", "215", "265", "315"]}
And because you want the numbers in the array to be integers:
string.scan(regexp).to_h { |values| [values[0], values[1..-1].map(&:to_i)] }
#=> {"race_1"=>[650, 215, 265, 315], "race_2"=>[165, 215, 265, 315]}
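One caveat worth checking before relying on this: the pattern hard-codes exactly four distances per race. The sketch below uses a made-up input (short_string, an assumption for this illustration) in which race_1 has only three distances, to show that such a line is silently skipped rather than reported.

```ruby
regexp = /(race_\d+).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m)).*?(\d+(?=m))/

# Hypothetical input: race_1 has three distances, race_2 has four
short_string = "race_1: 650m, 215m, 265m\r\nrace_2: 165m, 215m, 265m, 315m"

# "." does not match "\n", so a match cannot borrow digits from the next line;
# race_1 therefore produces no match at all
result = short_string.scan(regexp)
#=> [["race_2", "165", "215", "265", "315"]]
```

If the number of distances can vary, one of the approaches that splits the value part of each line is the safer choice.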
Upvotes: 4
Reputation: 329
Could you try the code below?
str = "race_1: 650m, 215m, 265m, 315m\r\nrace_2: 165m, 215m, 265m, 315m."
rows = str.delete('.').split("\r\n") # => ["race_1: 650m, 215m, 265m, 315m", "race_2: 165m, 215m, 265m, 315m"]
hash_result = {}
rows.each do |row|
key = row.split(':').first # => race_1
value = row.split(':').last.split('m, ').map(&:to_i) # => [650, 215, 265, 315]
hash_result[key.to_sym] = value
end
# hash_result = {:race_1=>[650, 215, 265, 315], :race_2=>[165, 215, 265, 315]}
P.S. I think you should try to solve it yourself first; that is the best way to improve.
Upvotes: 0