vij555

Reputation: 359

Mathematical model to build a ranking/scoring system

I want to rank a set of sellers. Each seller is described by parameters var1, var2, var3, ..., var20, and I want to compute a score for each seller.

Currently I calculate the score by assigning weights to these parameters (say 10% to var1, 20% to var2, and so on), and the weights are based on my gut feeling.

My score equation looks like

score = w1*var1 + w2*var2 + ... + w20*var20
score = 0.1*var1 + 0.5*var2 + 0.05*var3 + ... + 0.0001*var20

My score equation could also look like

score = w1^2*var1 + w2*var2 + ... + w20^5*var20

where var1, var2, ..., var20 are normalized.
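For reference, a minimal sketch of the linear score with normalized inputs. The feature names, the sample values, the weights, and the choice of min-max normalization are all assumptions made for illustration; the point is only that "lower is better" features (such as confirmation time) need to be flipped before they are weighted.

```python
def min_max_normalize(values, higher_is_better=True):
    """Rescale raw values to [0, 1]; flip the scale when lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All sellers are identical on this feature, so it carries no information.
        return [0.5 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def weighted_scores(features, weights):
    """features maps a feature name to a list of normalized values, one per seller."""
    n_sellers = len(next(iter(features.values())))
    return [
        sum(weights[name] * features[name][i] for name in features)
        for i in range(n_sellers)
    ]


# Hypothetical raw data for three sellers (assumed values, not real data).
fulfillment_rate = [0.95, 0.80, 0.99]
confirm_hours = [2.0, 10.0, 8.0]

features = {
    "fulfillment": min_max_normalize(fulfillment_rate),
    "confirm_time": min_max_normalize(confirm_hours, higher_is_better=False),
}
weights = {"fulfillment": 0.7, "confirm_time": 0.3}  # gut-feeling weights

scores = weighted_scores(features, weights)
# Seller indices ordered from best to worst score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

With these made-up numbers, the first seller ranks highest because it is fast to confirm and close to the best fulfillment rate.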

Which equation should I use? What methods are there to determine scientifically which weights to assign?

I want to optimize these weights and revamp the scoring mechanism with a data-driven approach, so that the score is more relevant.

Example

I have the following features for sellers:

1] Order fulfillment rate [numeric]

2] Order cancel rate [numeric]

3] User rating [1-5] {1-2: Worst, 3: Average, 5: Good} [categorical]

4] Time taken to confirm the order (the shorter the time, the better the seller) [numeric]

5] Price competitiveness

Are there better algorithms or approaches for calculating the score? So far I have simply added the features linearly, and I would like to know of a better way to build the ranking system.

How do I come up with values for the weights?

Apart from the features above, a few more I can think of are the ratio of positive to negative reviews, the rate of damaged goods, etc. How would these fit into my score equation?

Upvotes: 3

Views: 2431

Answers (1)

Felix Castor

Reputation: 1675

Unfortunately Stack Overflow doesn't support LaTeX, so images will have to do:

Also, as a disclaimer: I don't think this is a concise answer, but your question is quite broad. This has not been tested, but it is the approach I would most likely take for a similar problem.

As a possible direction to go, below is the multivariate Gaussian. The idea is that each parameter lives in its own dimension and can therefore be weighted by importance. Example:

With Sigma = [1,0,0; 0,2,0; 0,0,3] and a vector [x1,x2,x3], x1 would have the greatest importance: it has the smallest variance, so a deviation in x1 changes the function value the most.
  1. The covariance matrix Sigma takes care of scaling in each dimension. To achieve this, simply put the weights on the diagonal of an n x n diagonal matrix. You are not really concerned with the cross terms.
  2. Mu is a vector: the average over all logs in your data for your sellers.
  3. x is a vector x = {x1, x2, x3, ..., xn} holding the mean of every category for a particular seller. It is updated continuously as more data are collected.
  4. The parameters of the function, which are based on the total dataset, should evolve as well. That way biased voting, especially in the "feelings"-based categories, can be weeded out.
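
As a rough sketch of points 2 and 3, Mu and the per-seller x could be computed from raw logs like this. The log layout (a seller id plus a list of category values per entry), the seller ids, and the helper name `column_means` are assumptions for illustration only.

```python
from collections import defaultdict


def column_means(rows):
    """Average each column of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical logs: (seller_id, [category_1, category_2]) per entry.
logs = [
    ("a", [0.9, 4.0]),
    ("a", [0.7, 5.0]),
    ("b", [0.5, 3.0]),
    ("b", [0.3, 2.0]),
]

# Mu: the average over all logs, regardless of seller.
mu = column_means([values for _, values in logs])

# x: the per-category mean for each individual seller.
rows_by_seller = defaultdict(list)
for seller_id, values in logs:
    rows_by_seller[seller_id].append(values)
x = {seller_id: column_means(rows) for seller_id, rows in rows_by_seller.items()}
```

Recomputing `mu` and `x` whenever new log entries arrive is one simple way to let both evolve with the data, as described in points 3 and 4.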

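[Image: the multivariate Gaussian density, f_x(x) = (2*pi)^(-n/2) * det(Sigma)^(-1/2) * exp(-(1/2) * (x - Mu)^T * Sigma^(-1) * (x - Mu))]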

After that setup, you can play with how the function f_x is evaluated to get the desired results. It is a probability density function, but its usefulness is not restricted to statistics.
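
A minimal, untested-in-production sketch of evaluating f_x for the diagonal Sigma from the example above. It works with the log of the density (an assumption made for numerical convenience), and the function name and the sample vectors are hypothetical.

```python
import math


def diag_gaussian_log_density(x, mu, variances):
    """Log of the multivariate Gaussian density when Sigma is diagonal.

    `variances` holds the diagonal entries of Sigma, one per dimension.
    """
    # Squared deviations, each scaled by its own variance.
    quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, variances))
    # Log of the normalization constant: log((2*pi)^n * det(Sigma)).
    log_norm = sum(math.log(2.0 * math.pi * v) for v in variances)
    return -0.5 * (quad + log_norm)


mu = [0.0, 0.0, 0.0]
variances = [1.0, 2.0, 3.0]  # diagonal of Sigma = [1,0,0; 0,2,0; 0,0,3]

# The same deviation of 1.0, applied once to x1 and once to x3.
dev_in_x1 = diag_gaussian_log_density([1.0, 0.0, 0.0], mu, variances)
dev_in_x3 = diag_gaussian_log_density([0.0, 0.0, 1.0], mu, variances)
```

Here `dev_in_x1` comes out lower than `dev_in_x3`, which is the sense in which x1 carries the most weight. Note that the density by itself only measures how far a seller is from the average, not in which direction, so turning it into a ranking still needs a rule for which side of Mu counts as better.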

Upvotes: 2
