Beffa

Reputation: 915

Is there a database or a data structure for Ruby to implement relation matrices?

I've got a very simple data set

In the application you select attributes such as red, blue, ... and get the products sorted by relevance.

What's the best way to store the data? Is there a good data structure (or library) for Ruby?

(Don't tell me how to implement it with traditional 3 tables sql and active record. I already know how. I'm looking for a better solution.)

One way could be with a Hash:

"red" => {"Jeans myway" => 0, "Polo Shirt" => 100}, "blue" => {..

Is this a good approach, and how should I store it in a file?

[edit] What I mean by a better solution: if I use a relational DB, I have to split the matrix into three tables: products, attributes, and attributes_products. I would like to store it in one table/matrix and search/use it like a matrix.

E.g. I want to select the products for which the attributes 'old fashion' and 'modern' are both relevant (>0), sorted by relevance. That would return 'Jeans myway 0.09', 'Polo Shirt 0.04'. (Relevance is calculated by multiplication.)

Upvotes: 1

Views: 225

Answers (1)

Phrogz

Reputation: 303261

Here's a solution tailored for your needs. It allows you to select an arbitrary number of products or attributes and see the values in the other axis sorted by the product of the weightings. It lets you store your data in CSV, so you can just keep a big Excel file of your grid for future tweaking.

The merge_by, sort_by, and filter_by methods let you specify the block to be applied when getting your results.
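As a rough, standalone illustration of what swapping those blocks means, here is a sketch that uses plain hashes instead of the class below. The `rank` helper, the constant names, and the choice of averaging instead of multiplying are assumptions made for this example only; they are not part of `RelationTable`.

```ruby
# Hypothetical helper: group weights per name, then merge, filter, and sort
# them with caller-supplied blocks (the same three steps that find applies).
def rank(columns, merge:, filter:, sort:)
  grouped = Hash.new { |h, k| h[k] = [] }
  columns.each { |col| col.each { |name, weight| grouped[name] << weight } }
  grouped.map { |name, weights| [name, merge.call(weights)] }
         .select { |name, value| filter.call(name, value) }
         .sort_by { |name, value| sort.call(name, value) }
end

# Two attribute columns, using the weights from the test data in this answer
SEXY   = { 'Jeans myway' => 70.0, 'Polo Shirt' => 1.0,  'Bra' => 100.0 }
MODERN = { 'Jeans myway' => 30.0, 'Polo Shirt' => 10.0, 'Bra' => 100.0 }

average    = ->(weights) { weights.sum / weights.size } # instead of a product
at_least10 = ->(_name, value) { value >= 10 }           # instead of value > 0
descending = ->(name, value) { [-value, name] }

p rank([SEXY, MODERN], merge: average, filter: at_least10, sort: descending)
#=> [["Bra", 100.0], ["Jeans myway", 50.0]]
```

With the class below, the equivalent would be to call `merge_by`, `filter_by`, and `sort_by` with blocks of the same shape before calling `find`.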

TESTDATA = <<ENDCSV
,red,blue,modern,old fashion,sexy
Jeans myway,0%,100%,30%,30%,70%
Polo Shirt,100%,0%,10%,40%,1%
Bra,100%,0%,100%,0%,100%
ENDCSV

def test
  products = RelationTable.load_from_csv( TESTDATA )

  p products.find( :col, 'old fashion','modern' )
  #=> [["Jeans myway", 9.0], ["Polo Shirt", 4.0]]

  p products.find( :row, 'Polo Shirt' )
  #=> [["red", 100.0], ["old fashion", 40.0], ["modern", 10.0], ["sexy", 1.0]]

  p products.find( :col, 'sexy' )
  #=> [["Bra", 100.0], ["Jeans myway", 70.0], ["Polo Shirt", 1.0]]

  p products.find( :row, 'Polo Shirt','Bra' )
  #=> [["red", 100.0], ["modern", 10.0], ["sexy", 1.0]]

  p products.find( :col, 'sexy','modern' )
  #=> [["Bra", 100.0], ["Jeans myway", 21.0], ["Polo Shirt", 0.1]]

  p products.find( :col, 'red', 'blue' )
  #=> []

  p products.find( :col, 'bogus' )
  #=> []
end

class RelationTable
  def self.load_from_csv( csv )
    require 'csv'
    data = CSV.parse(csv)
    self.new( data.shift[1..-1], data.map{ |r| r.shift }, data )
  end
  def initialize( col_names=[], row_names=[], weights=[] )
    @by_col = Hash.new{|h,k|h[k]=Hash.new(0)}
    @by_row = Hash.new{|h,k|h[k]=Hash.new(0)}
    row_names.each_with_index do |row,r|
      col_names.each_with_index do |col,c|
        @by_col[col][row] = @by_row[row][col] = weights[r][c].to_f
      end
    end
    # Multiply all weights, sort by weight (descending), only include non-zero
    merge_by{ |values| values.inject(1.0){ |weight,v| weight*v/100 }*100 }
    sort_by{ |key,value| [-value,key] }
    filter_by{ |key,value| value > 0 }
  end
  def merge_by(&proc);  @merge  = proc; end
  def sort_by(&proc);   @sort   = proc; end
  def filter_by(&proc); @filter = proc; end
  def find( row_or_col, *names )
    axis = (row_or_col == :row) ? @by_row : @by_col
    merge(axis.values_at(*names)).select(&@filter).sort_by(&@sort)
  end
  private
    # Turn an array of hashes into a hash of arrays of values,
    # and then merge the values using the merge_by proc
    def merge( hashes )
      if hashes.length==1
        hashes.first # Speed optimization; ignores the merge_by block
      else
        result = Hash.new{|h,k|h[k]=[]}
        hashes.each{ |h| h.each{ |k,v| result[k] << v } }
        result.each{ |k,values| result[k] = @merge[values] }
        result
      end
    end
end

test if __FILE__==$0
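The class above only reads CSV. For writing a grid back out, one possible sketch is shown here; `grid_to_csv`, the `GRID` constant, and the file name are hypothetical and not part of `RelationTable`. It assumes the data is held as a hash of rows, each mapping column names to percentage weights, and emits the same layout as TESTDATA.

```ruby
require 'csv'

# Hypothetical helper: serialize { row => { col => weight } } into a grid
# whose first line lists the column names and whose first cell is empty.
def grid_to_csv(by_row)
  col_names = by_row.values.flat_map(&:keys).uniq
  CSV.generate do |csv|
    csv << [nil, *col_names]
    by_row.each do |row_name, weights|
      # Missing cells default to 0; weights are written as percentages
      csv << [row_name, *col_names.map { |col| "#{weights.fetch(col, 0)}%" }]
    end
  end
end

GRID = {
  'Jeans myway' => { 'red' => 0,   'blue' => 100 },
  'Polo Shirt'  => { 'red' => 100, 'blue' => 0 }
}

csv_text = grid_to_csv(GRID)
puts csv_text
# File.write('products.csv', csv_text)  # file name is an assumption
```

The resulting text should be loadable again with `RelationTable.load_from_csv`, since it follows the same header and percentage conventions.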

Upvotes: 2
