Reputation: 915
I've got a very simple data set, a cross table of attribute relevance per product:

             red   blue  modern  old fashion  ... (50+ entries)
Jeans myway  0%    100%  30%     30%
Polo Shirt   100%  0%    10%     40%
... (500+ entries)
In the application you select attributes such as red, blue, ... and get the products sorted by relevance.
What's the best way to store the data? Is there a good data structure (or library) for Ruby?
(Don't tell me how to implement it with the traditional three-table SQL schema and ActiveRecord. I already know how. I'm looking for a better solution.)
One way could be with a Hash:
"red" => {"Jeans myway" => 0, "Polo Shirt" => 100}, "blue" => {..
Is this a good way, and how should I store it to a file?
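One possible way to persist such a hash is sketched below. It assumes that Ruby's standard json library is acceptable and that the weights are kept as integer percentages; the file name relevance.json and the temporary directory are made up for illustration.

```ruby
require 'json'
require 'tmpdir'

# Nested hash as in the question: attribute => { product => relevance in percent }
relevance = {
  "red"  => { "Jeans myway" => 0,   "Polo Shirt" => 100 },
  "blue" => { "Jeans myway" => 100, "Polo Shirt" => 0 }
}

# Hypothetical file location; a temp dir keeps the example free of side effects
path = File.join(Dir.mktmpdir, "relevance.json")

# Write the hash as human-readable JSON, then read it back
File.write(path, JSON.pretty_generate(relevance))
loaded = JSON.parse(File.read(path))

# String keys and integer values survive the round trip unchanged
puts loaded == relevance
```

JSON keeps the file readable and editable by hand; YAML or Marshal from the standard library would be comparable options if that matters less.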
[edit] What I mean by a better solution: if I use a relational DB, I have to split the matrix into three tables: products, attributes, and attributes_products. I would like to store it in one table/matrix and search and use it like a matrix.
E.g. selecting the products for which the attributes 'old fashion' and 'modern' are both relevant (> 0), sorted by relevance, would return 'Jeans myway 0.09', 'Polo Shirt 0.04'. (Relevance is calculated by multiplication: 0.30 × 0.30 = 0.09 and 0.40 × 0.10 = 0.04.)
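The multiplication rule described here can be sketched directly on the nested hash. This is only an illustration, not a recommended storage design: the method name rank and the variable relevance are hypothetical, and the weights are assumed to be stored as integer percentages so that the arithmetic stays exact until the final division.

```ruby
# attribute => { product => relevance in percent }, values taken from the table above
relevance = {
  "modern"      => { "Jeans myway" => 30, "Polo Shirt" => 10 },
  "old fashion" => { "Jeans myway" => 30, "Polo Shirt" => 40 }
}

# Multiply the percentages of the chosen attributes per product,
# drop products with zero relevance, and sort by relevance (descending).
def rank(table, *attributes)
  return [] if attributes.empty?
  columns  = attributes.map { |a| table.fetch(a) }
  products = columns.map(&:keys).inject(:&) # products present in every column
  scored = products.map do |product|
    percent_product = columns.inject(1) { |acc, col| acc * col[product] }
    # Convert the product of percentages back to a fraction between 0 and 1
    [product, percent_product / 100.0**attributes.size]
  end
  scored.select { |_, score| score > 0 }.sort_by { |name, score| [-score, name] }
end

ranking = rank(relevance, "old fashion", "modern")
p ranking
#=> [["Jeans myway", 0.09], ["Polo Shirt", 0.04]]
```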
Upvotes: 1
Views: 225
Reputation: 303261
Here's a solution tailored for your needs. It allows you to select an arbitrary number of products or attributes and see the values on the other axis, sorted by the product of the weightings. It lets you store your data in CSV, so you can just keep the grid in a big spreadsheet (e.g. Excel, exported as CSV) for future tweaking.
The merge_by, sort_by, and filter_by methods let you specify the block to be applied when getting your results.
TESTDATA = <<ENDCSV
,red,blue,modern,old fashion,sexy
Jeans myway,0%,100%,30%,30%,70%
Polo Shirt,100%,0%,10%,40%,1%
Bra,100%,0%,100%,0%,100%
ENDCSV

def test
  products = RelationTable.load_from_csv( TESTDATA )
  p products.find( :col, 'old fashion','modern' )
  #=> [["Jeans myway", 9.0], ["Polo Shirt", 4.0]]
  p products.find( :row, 'Polo Shirt' )
  #=> [["red", 100.0], ["old fashion", 40.0], ["modern", 10.0], ["sexy", 1.0]]
  p products.find( :col, 'sexy' )
  #=> [["Bra", 100.0], ["Jeans myway", 70.0], ["Polo Shirt", 1.0]]
  p products.find( :row, 'Polo Shirt','Bra' )
  #=> [["red", 100.0], ["modern", 10.0], ["sexy", 1.0]]
  p products.find( :col, 'sexy','modern' )
  #=> [["Bra", 100.0], ["Jeans myway", 21.0], ["Polo Shirt", 0.1]]
  p products.find( :col, 'red', 'blue' )
  #=> []
  p products.find( :col, 'bogus' )
  #=> []
end

class RelationTable
  def self.load_from_csv( csv )
    require 'csv'
    data = CSV.parse(csv)
    self.new( data.shift[1..-1], data.map{ |r| r.shift }, data )
  end

  def initialize( col_names=[], row_names=[], weights=[] )
    @by_col = Hash.new{|h,k|h[k]=Hash.new(0)}
    @by_row = Hash.new{|h,k|h[k]=Hash.new(0)}
    row_names.each_with_index do |row,r|
      col_names.each_with_index do |col,c|
        @by_col[col][row] = @by_row[row][col] = weights[r][c].to_f
      end
    end
    # Multiply all weights, sort by weight (descending), only include non-zero
    merge_by{ |values| values.inject(1.0){ |weight,v| weight*v/100 }*100 }
    sort_by{ |key,value| [-value,key] }
    filter_by{ |key,value| value > 0 }
  end

  def merge_by(&proc); @merge = proc; end
  def sort_by(&proc); @sort = proc; end
  def filter_by(&proc); @filter = proc; end

  def find( row_or_col, *names )
    axis = (row_or_col == :row) ? @by_row : @by_col
    merge(axis.values_at(*names)).select(&@filter).sort_by(&@sort)
  end

  private

  # Turn an array of hashes into a hash of arrays of values,
  # and then merge the values using the merge_by proc
  def merge( hashes )
    if hashes.length==1
      hashes.first # Speed optimization; ignores the merge_by block
    else
      result = Hash.new{|h,k|h[k]=[]}
      hashes.each{ |h| h.each{ |k,v| result[k] << v } }
      result.each{ |k,values| result[k] = @merge[values] }
      result
    end
  end
end

test if __FILE__==$0
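
As a rough illustration of what a custom merge_by block could look like, the sketch below compares three candidate merge functions on the same list of weights. Only product mirrors the default from the code above; minimum and average are hypothetical alternatives that are not part of the original answer. The lambdas stand alone so the snippet runs without the RelationTable class.

```ruby
# Each candidate receives the array of weights (in percent) collected for one key.

# Same formula as the default merge_by block in RelationTable#initialize
product = ->(values) { values.inject(1.0) { |weight, v| weight * v / 100 } * 100 }

# Hypothetical alternative: the weakest attribute decides
minimum = ->(values) { values.min }

# Hypothetical alternative: arithmetic mean of the weights
average = ->(values) { values.inject(:+) / values.size }

# Weights of 'Polo Shirt' for 'old fashion' and 'modern' in the test data
weights = [40.0, 10.0]

results = {
  product: product[weights],
  minimum: minimum[weights],
  average: average[weights]
}
p results
```

With the class loaded, one of these would be plugged in with something like products.merge_by { |values| values.min } before calling find. Note that find skips the merge block entirely when only one name is given, so a custom block only affects queries with two or more names.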
Upvotes: 2