green_ruby

Reputation: 21

How do I create key-value pairs by looping over subarrays in Ruby?

I'm trying to write a Ruby program that parses the following TSV file and loops over each record, adding each shop name (last column) as a key in a hash and the associated price (second column) as the value for that key:

White Bread £1.20   Baker
Whole Milk  £0.80   Corner Shop
Gorgonzola  £10.20  Cheese Shop
Mature Cheddar  £5.20   Cheese Shop
Limburger   £6.35   Cheese Shop
Newspaper   £1.20   Corner Shop
Ilchester   £3.99   Cheese Shop

So the aim is to end up with a hash with entries in the following format: shop => price.

Here's the code I've got so far:

totals = {}

File.open("shopping.tsv") do |file|
  records = file.each_line.map { |line| line.chomp.split("\t") }
  records.each { |_, price, shop| totals[shop.to_sym] = price }
  puts(totals)
end

This produces an incorrect output with only some of the records parsed and added to the totals hash (and also some inconsistencies in the way the key symbols are presented):

{:Baker=>"£1.20", :"Corner Shop"=>"£1.20", :"Cheese Shop"=>"£3.99"}

Why is this happening? The output above gives the data in the desired format, but is missing most of the records. I'd eventually like to extend this program to provide totals for each shop by only adding a new hash entry if a given key doesn't already exist, but I'd like to get to the bottom of this issue before going any further.

I've spent a fair amount of time on print debugging and can confirm that the data is being correctly parsed by the file.each_line.map method, with each record being turned into a subarray containing the fields as expected. The problem appears to stem from the next line, which attempts to add just the shop and price fields to the hash.

I've also checked Stack Overflow and noticed that similar incorrect outputs often stem from iterating over an array whilst changing it, although that doesn't appear to be what I'm doing here (please correct me if I'm wrong). I've also experimented with using the dup method to create a copy of each subarray rather than creating the hash from the original data, but I still get the same result.

I'd be grateful if someone could please enlighten me as to what is going on here.

Thanks in advance.

Upvotes: 1

Views: 123

Answers (2)

Hector Correa

Reputation: 26690

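I now realize you are trying to calculate the total per shop. Your approach is very close, but a hash holds only one value per key, so totals[shop.to_sym] = price replaces whatever was stored for that shop on each iteration, which is why only the last price per shop shows up. You'll need to increase the total per key rather than replace it.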

Below is an example of how to get this done:

item1 = {name: "White Bread", price: 1.20, shop: "Baker" }
item2 = {name: "Whole Milk", price: 0.80, shop: "Corner Shop"}
item3 = {name: "Gorgonzola", price: 10.20, shop: "Cheese Shop"}
item4 = {name: "Mature Cheddar", price: 5.20, shop: "Cheese Shop"}
item5 = {name: "Limburger", price: 6.35, shop: "Cheese Shop"}
item6 = {name: "Newspaper", price: 1.20, shop: "Corner Shop"}
item7 = {name: "Ilchester", price: 3.99, shop: "Cheese Shop"}
list = [item1, item2, item3, item4, item5, item6, item7]

totals = {}
list.each do |item|
  key = item[:shop].to_sym
  if totals[key] == nil
    # initialize the total for this shop
    totals[key] = item[:price]
  else
    # increase the previous total for this shop
    totals[key] = totals[key] + item[:price]
  end
end

puts totals

Using your original code I think the solution would be something like this:

totals = {}

File.open("shopping.tsv") do |file|
  records = file.each_line.map { |line| line.chomp.split("\t") }
  records.each do |_, price, shop|
    # convert the price to a number, i.e. drop the leading £
    # (the endless range price[1..] needs Ruby 2.6 or later)
    price_num = price[1..].to_f
    if totals[shop.to_sym] == nil
      # initialize the total for this shop
      totals[shop.to_sym] = price_num
    else
      # increase the previous total for this shop
      totals[shop.to_sym] = totals[shop.to_sym] + price_num
    end
  end
  puts(totals)
end
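
To see the difference between replacing and keeping values in isolation, here is a minimal sketch using three of the records from the question as an in-memory array instead of reading the file. The variable names last_price and all_prices are only for illustration:

```ruby
# Records as they would come out of line.chomp.split("\t")
# (a three-row subset of the question's data).
records = [
  ["White Bread", "£1.20", "Baker"],
  ["Whole Milk",  "£0.80", "Corner Shop"],
  ["Newspaper",   "£1.20", "Corner Shop"]
]

# Plain assignment: a later record for the same shop replaces the earlier one.
last_price = {}
records.each { |_, price, shop| last_price[shop.to_sym] = price }

# Appending to an array per shop keeps every record.
all_prices = {}
records.each { |_, price, shop| (all_prices[shop.to_sym] ||= []) << price }

p last_price
p all_prices
```

The first hash ends up with one entry per shop holding the last price seen, while the second keeps both Corner Shop prices. As for the key presentation: a symbol containing a space can only be written in its quoted form, :"Corner Shop", so that is not an inconsistency in the data.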

Upvotes: 2

Stefan

Reputation: 114237

To parse a delimited file, you can utilize Ruby's CSV library. It defaults to , as the delimiter, but you can easily specify \t for tab:

require 'csv'

CSV.foreach("shopping.tsv", col_sep: "\t") do |row|
  p product: row[0], price: row[1], shop: row[2]
end

If you prefer named references, you can also specify headers:

CSV.foreach("shopping.tsv", col_sep: "\t", headers: [:product, :price, :shop]) do |row|
  p product: row[:product], price: row[:price], shop: row[:shop]
end

Both of the above will output:

{:product=>"White Bread", :price=>"£1.20", :shop=>"Baker"}
{:product=>"Whole Milk", :price=>"£0.80", :shop=>"Corner Shop"}
{:product=>"Gorgonzola", :price=>"£10.20", :shop=>"Cheese Shop"}
{:product=>"Mature Cheddar", :price=>"£5.20", :shop=>"Cheese Shop"}
{:product=>"Limburger", :price=>"£6.35", :shop=>"Cheese Shop"}
{:product=>"Newspaper", :price=>"£1.20", :shop=>"Corner Shop"}
{:product=>"Ilchester", :price=>"£3.99", :shop=>"Cheese Shop"}
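
If you'd like to try these options without a file on disk, a minimal sketch using CSV.parse on an inline string might look like the following. The sample string and the variable names are only for illustration; col_sep and headers are the same options as above:

```ruby
require 'csv'

# Two tab-separated records taken from the question's data (illustrative sample).
sample = "White Bread\t£1.20\tBaker\nWhole Milk\t£0.80\tCorner Shop\n"

# With an array of headers, the first line is treated as data, not as a header row.
table = CSV.parse(sample, col_sep: "\t", headers: [:product, :price, :shop])

# Convert each CSV::Row into a plain hash keyed by the given headers.
rows = table.map(&:to_h)

rows.each { |row| p row }
```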

Note that the prices are still strings. In order to calculate a sum, you have to convert them into something numerical. Since these are monetary values, I'd recommend the 3rd-party Money gem and its Monetize addition for parsing string values:

require 'money'
require 'monetize'

I18n.config.available_locales = :en
Money.locale_backend = :i18n
Money.default_currency = Money::Currency.new('GBP')

It lets you parse monetary strings into Money instances and then, among many other features, perform arithmetic and formatting on them:

a = Monetize.parse("£1.20")
#=> #<Money fractional:120 currency:GBP>

b = Monetize.parse("£0.80")
#=> #<Money fractional:80 currency:GBP>

c = a + b
#=> #<Money fractional:200 currency:GBP>

c.format
#=> "£2.00"

Putting CSV and Money/Monetize together:

totals = Hash.new(Money.zero)

CSV.foreach("shopping.tsv", col_sep: "\t", headers: [:product, :price, :shop]) do |row|
  totals[row[:shop]] += Monetize.parse(row[:price])
end

p totals.transform_values(&:format)

Note that I'm using += to actually add each price to the corresponding shop's hash entry.

Output:

{"Baker"=>"£1.20", "Corner Shop"=>"£2.00", "Cheese Shop"=>"£25.74"}

In case you're wondering what totals = Hash.new(Money.zero) is for: it creates a hash with a default value of £0.00. Having a default value for each key lets us add the price values right away without having to worry about the initial value being nil.
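
The same default-value idea also works without any gems. Below is a rough sketch that keeps the totals as integer pence to avoid floating-point rounding. The to_pence helper is hypothetical (it is not part of Ruby or of Money/Monetize) and assumes prices always look like "£1.20"; the records are hard-coded from the question's data rather than read from the file:

```ruby
# Hypothetical helper (not part of any library): "£10.20" -> 1020 pence.
def to_pence(price)
  pounds, pence = price.delete("£").split(".")
  pounds.to_i * 100 + pence.to_s.ljust(2, "0")[0, 2].to_i
end

records = [
  ["White Bread",    "£1.20",  "Baker"],
  ["Whole Milk",     "£0.80",  "Corner Shop"],
  ["Gorgonzola",     "£10.20", "Cheese Shop"],
  ["Mature Cheddar", "£5.20",  "Cheese Shop"],
  ["Limburger",      "£6.35",  "Cheese Shop"],
  ["Newspaper",      "£1.20",  "Corner Shop"],
  ["Ilchester",      "£3.99",  "Cheese Shop"]
]

# Missing keys read as 0, so += works on the first record for each shop.
totals = Hash.new(0)
records.each { |_, price, shop| totals[shop] += to_pence(price) }

# Turn the pence totals back into "£x.yy" strings.
formatted = totals.transform_values { |pence| format("£%d.%02d", *pence.divmod(100)) }
p formatted
```

A shared default object is safe here because integers are immutable and += stores a new value under the key. With a mutable default such as an array, every missing key would return the same object, so the block form Hash.new { |hash, key| hash[key] = [] } is the usual choice there.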

Upvotes: 4
