Paul Mouzas
Paul Mouzas

Reputation: 53

Approximating Price

I have a set of products. Each product is a variation of a non existent “parent”. Also, each product (let’s call them child products) has its own individually assigned price in our database. Here is a small example set.

Parent SKU is 1000.

Product Children are:

1000-TankTop-SM - 14.95
1000-TankTop-2X - 17.95
1000-Hoodie-SM - 34.95
1000-Hooodie-2X - 39.95

Here is the problem. Our database lists each real child product price (as directly above) in a one-to-one relationship. Each product has a SKU and I can look up the price of each product by SKU. I have a website that that cannot support this method of pricing. The way pricing works is this. I create a “parent” product. Each parent product must have a base price. The prices of variations are created from adding or subtracting a dollar amount. So a “parent” has two attribute sets, product type and size. A plus or minus amount must be associated with each attribute. So from my example above we have.

Sizes:

SM +- ?
2X +- ?

Product Types:

TankTop +- ?
Hoodie += ?

How can I decide what the variables above should equal to at least approximate the actual child product prices? Is this possible without any extreme outliers?

Upvotes: 1

Views: 83

Answers (1)

Cloud
Cloud

Reputation: 19333

This sounds like a frustrating (ie: crummy) database system, since it's effectively impossible to create certain arbitrary prices. ie:

TankTop = + $2.00
Shirt   = + $1.00
Sweat   = + $5.00

Small   = - $1.00
Medium  = + $0.00
Large   = + $3.00
X-Large = + $5.00

With the above example, it would be impossible to have a Small Shirt cost $10.00 while simultaneously having a Medium Shirt cost $10.50.

So, each product has a price defined as a sum of: BASE_SKU_PRICE + SIZE_MODIFIER + STYLE_MODIFIER. This means that you cannot assign an arbitrary price value to each unique item, so you'll need to use a regression model.

If you want to re-adjust the price for a massive table of items, the easiest approach to minimize outliers would be a multivariate variation of linear least mean square errors approximation (LMS), which is just another type of multivariate linear regression approach.

This will allow you to model each unique item (ie: SKU) as a function of:

y = a + bX_1 + cX_2

If you want a very tidy approach to handling this for a production database system, you would be best off just using MATLAB or SPSS to create your database table, as you can specify confidence intervals, and other parameters to help optimize your approximation.

Finally, I found an example online which you could try out in OpenOffice Calc or Microsoft Excel. This will give you a working algorithmic approach rather than you having to derive the analytical form equations and generate code from them. It might even be enough to solve your problem without having to break out MATLAB or SPSS.

Upvotes: 2

Related Questions