Reputation: 21749
I am developing an e-shop where I will sell food. I want a suggestion box that recommends what else a user could buy based on what is already in the cart. If the cart contains beer, I want it to suggest chips and other items, ordered by descending probability that the user will buy them too. I also want the algorithm to learn its suggestions from all users' previous purchases. Where should I start? I have a groceries table with user_id, item_id, date and similar columns. How can I build the suggestion box without brute-forcing every combination, which is impossible?
Upvotes: 37
Views: 2875
Reputation: 367
Like everyone above says, the key to making this work is to implement the 'x users also bought y item' feature.
Basically, you need to experiment with more rows and columns in the existing database, or link new tables that keep statistical data about the products people view. One very important column you need is a rating or like (not a Facebook like).
You would need new tables like:
You would also need to update existing tables with extra columns like:
If user x and user y are friends, their IDs would be matched in the Friends table. The Like table would record when two users who are friends both like a product z, using either a rating (0-5, 0-10 or 0-100) or a like (0/1); you decide which method.
When a product is liked or rated, its ID would have a product rating column updated with +X or -X, depending on whether you use ratings or likes. You would also need to decide on the threshold at which a product counts as positively rated or liked, for example 50% for ratings or 100 likes for likes.
With all this done, when a user x shops for products, you can match to see if:
You can do a lot more than suggest products. With just a little effort, you can offer hot deals to people and their friends. Say a new product emerges on the market that is like product z, only better. If X people and all their friends like product z, they would possibly like and buy the new product too.
Upvotes: 0
Reputation: 419
I think that the best way to do that is with a "tags pattern". For example:
products Table:
=================================
product_id
product_name
tags Table:
=================================
tag_id
tag_name
tags_products Table:
=================================
id_product
id_tag
products registry example:
=================================
1 | Beer
2 | Chips
3 | Cake
tags registry example:
=================================
1 | beer
2 | chips
3 | cake
tags_products registry example:
=================================
1 | 2
1 | 3
2 | 1
2 | 3
3 | 1
Then you can relate products however you want and query them easily :)
Be happy.
Greetings.
Upvotes: 0
Reputation: 156
Hmm... you are looking for a product recommendation engine, then. Well, they come, basically, in three flavours:
The first one, collaborative filtering, gathers and stores data on your users' activities, preferences, behavior, etc. This data is then sent into an engine that separates it into user channels. Each channel has certain characteristic likes and dislikes. So, when you have a new visitor, he or she will be classified and assigned a specific user profile. Then items will be displayed based on this profile's likes and dislikes.
Now, content-based filtering uses a different approach - a less social one - by taking into account ONLY your user's previous browsing history, his preferences and activities. Essentially, this will create recommendations based on what this user has previously liked/purchased.
But why choose just one of them, right? Hybrid recommender systems use a bit of both to provide a personalized yet social recommendation. These are usually more accurate when it comes to providing recommendations.
I think that collaborative filtering is a great option when you have a big influx of users; it's kinda hard to build good channels with only 42 users/month accessing your website. The second option, based on content, is better for a small site with plenty of products. However, IMHO, the third one is the one for you: build something that will get users going from the start and gather all the data they generate so that, in the future, you can offer an Amazon-like recommendation experience!
Building one of these is no easy task, as I'm sure you already know, but I strongly recommend this book (using personal-history filtering!), which has really come through for me in the past: http://www.amazon.com/Algorithms-Intelligent-Web-Haralambos-Marmanis/dp/1933988665
Good luck and good learning!
Upvotes: 6
Reputation: 741
You will probably like the Non-negative Matrix Factorization algorithm; it can do exactly what you are looking for (besides the stuff that Neville K mentioned). The database table with bought groceries will be the matrix to factorize. One factor will be a matrix that contains stuff that people bought together. This matrix will be much smaller than a matrix where you compare each grocery to all others. It would automatically find "groups" of groceries that go well together, like the categories that Fluffeh suggested, except that you would find those automatically. Steps to execute:
Someone already mentioned the book Programming Collective Intelligence. That's a good start.
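The factorization described above can be sketched roughly as follows. This is only a minimal illustration, assuming a tiny made-up user-by-item purchase matrix and the standard multiplicative update rules for NMF; it is written in pure Python to stay self-contained, whereas a real shop would use a numerical library. The names `nmf`, `purchases`, `items` and the choice of 2 groups are assumptions for the example, not part of the original answer.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def error(v, w, h):
    """Squared reconstruction error between V and the product W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, steps=200, seed=0, eps=1e-9):
    """Factor V (users x items) into W (users x k) and H (k x items)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        # Update H element-wise: H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Update W element-wise: W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h

# Hypothetical data: one row per user, one column per item (1 = bought).
items = ["beer", "chips", "milk", "cereal"]
purchases = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
W, H = nmf(purchases, k=2)

# Each row of H is one "group"; its largest weights are items bought together.
for group, row in enumerate(H):
    top = sorted(zip(row, items), reverse=True)[:2]
    print(group, [name for _, name in top])
```

Each row of `H` plays the role of an automatically discovered category, and each row of `W` says how strongly a user belongs to each group. The number of groups `k` has to be chosen by hand.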
Upvotes: 1
Reputation: 37
1 - Categorize each product with a three-layer categorization (type/function/price, for example). When a specific product is selected, you can then ignore all other categories, which saves a lot of time and effort, and pick random products from the same type/function/price group to put in your suggestion box.
That is, if you don't want to dive into the hassle of theoretical machine intelligence or coding complex algorithms.
have a nice day :)
Upvotes: 0
Reputation: 1453
This is a common problem, solved by the Apriori algorithm in data mining. You may need to create another table that maintains these statistics and then suggest items based on the preferred combinations.
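As a rough sketch of what those statistics could look like, here is Apriori restricted to single items and pairs, which is enough for "bought x, suggest y" rules. The baskets, the `pair_rules` function and the `min_count` threshold are all made up for illustration; full Apriori continues to larger itemsets in the same way.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_count=2):
    """Apriori limited to itemsets of size 1 and 2.

    Returns {(x, y): confidence}, where confidence is the share of
    baskets containing x that also contain y.
    """
    # Pass 1: count single items and keep only the frequent ones.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent = {item for item, c in item_counts.items() if c >= min_count}

    # Pass 2: count pairs built only from frequent items (the pruning step).
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent)
        pair_counts.update(combinations(kept, 2))

    rules = {}
    for (a, b), c in pair_counts.items():
        if c >= min_count:
            rules[(a, b)] = c / item_counts[a]
            rules[(b, a)] = c / item_counts[b]
    return rules

# Hypothetical purchase history.
baskets = [
    ["beer", "chips"],
    ["beer", "chips", "soda"],
    ["beer", "soda"],
    ["milk", "bread"],
    ["beer", "chips", "bread"],
]
rules = pair_rules(baskets)

# Suggestions for a cart containing beer, most likely first.
suggest = sorted(
    ((y, conf) for (x, y), conf in rules.items() if x == "beer"),
    key=lambda pair: -pair[1],
)
print(suggest)  # [('chips', 0.75), ('soda', 0.5)]
```

The `rules` dictionary is what you would persist in the extra statistics table and refresh periodically, rather than recomputing it on every page view.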
Upvotes: 9
Reputation: 29629
The thing you're describing is a recommendation engine; more specifically collaborative filtering. It's the heart of Amazon's "people who bought x also bought y" feature, and Netflix's recommendation engine.
It's a non-trivial undertaking. As in, getting anything that's even remotely useful could easily take more effort than building the ecommerce site in the first place.
For instance:
When I tried a similar project, it was very hard to explain to non-technical people that the computer simply didn't understand that recommending beer alongside nappies wasn't appropriate. Once we got the basic solution working, building the exclusion and edge case logic took at least as long.
Realistically, I think these are your options:
All those options are achievable in reasonable time; the problem with building a proper solution from scratch is that everyone will measure it against Amazon, and they've got a bit of a head start on you...
Upvotes: 45
Reputation: 964
Searching for a meaningful answer to your question, I came across this document:
Topic Tracking Model for Analyzing Consumer Purchase Behavior
I have read only part of the document, but it looks like it may be a theoretical answer for your question. I hope it helps.
Upvotes: 0
Reputation: 164
Make a cross-sell based on the purchasing habits of other customers who also bought that item. Let's say you have this purchase history in your database (orders table):
Then, if your customer has Beer in his cart, based on your customers' shopping habits you can easily run a query and see that the beer-related items are:
Then you can probably suggest chips and soda. The bigger your purchase history, the more accurate the system's suggestions will be.
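A minimal sketch of that lookup, assuming the orders have already been loaded into a dictionary of order id to item names. The data and the `also_bought` helper are invented for the example.

```python
from collections import Counter

def also_bought(orders, item, top=3):
    """Rank items by how many orders contain them together with `item`."""
    counts = Counter()
    for basket in orders.values():
        if item in basket:
            counts.update(set(basket) - {item})
    return counts.most_common(top)

# Hypothetical purchase history: order id -> items in that order.
orders = {
    1: ["beer", "chips"],
    2: ["beer", "chips", "soda"],
    3: ["milk", "bread"],
    4: ["beer", "soda", "chips"],
}
print(also_bought(orders, "beer"))  # [('chips', 3), ('soda', 2)]
```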
Upvotes: 6
Reputation: 9273
There are two basic ways to do it:
It looks like you're leaning towards the latter. I have written something like this for a site that sells various items and suggests related items based on other customers' past purchases. Here's the query I use:
SELECT items.*, COUNT( cartitems.itemid ) AS c
FROM items
    LEFT JOIN cartitems ON ( cartitems.itemid = items.id )
    LEFT JOIN carts ON ( carts.id = cartitems.cartid )
WHERE (
    carts.id IN (
        /* Every cart with this item: */
        SELECT cartitems.cartid
        FROM cartitems
        WHERE ( cartitems.itemid = 123456 )
    )
    AND ( cartitems.itemid != 123456 ) /* Items other than this one */
    AND carts.checkedout = TRUE /* Carts that have checked out */
)
GROUP BY cartitems.itemid
ORDER BY c DESC
LIMIT 5
This example assumes the item they're looking at has an id of 123456. The "carts" table contains past purchases. The "cartitems" table contains individual items that were purchased in the past.
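To try the idea without a full shop database, here is a small sketch that runs a simplified variant of the query against an in-memory SQLite database. The table and column names follow the query above; the sample rows, the `name` column and the `related_items` helper are assumptions for the demo, and the joins are written as inner joins because the WHERE clause filters on the joined tables anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE carts (id INTEGER PRIMARY KEY, checkedout INTEGER);
CREATE TABLE cartitems (cartid INTEGER, itemid INTEGER);
INSERT INTO items VALUES (1, 'Beer'), (2, 'Chips'), (3, 'Soda'), (4, 'Milk');
-- Cart 13 was never checked out, so it must not influence suggestions.
INSERT INTO carts VALUES (10, 1), (11, 1), (12, 1), (13, 0);
INSERT INTO cartitems VALUES
    (10, 1), (10, 2),
    (11, 1), (11, 2), (11, 3),
    (12, 4), (12, 3),
    (13, 1), (13, 4);
""")

def related_items(conn, item_id, limit=5):
    """Items most often found in checked-out carts that contain item_id."""
    sql = """
        SELECT items.name, COUNT(*) AS c
        FROM cartitems
        JOIN items ON items.id = cartitems.itemid
        JOIN carts ON carts.id = cartitems.cartid
        WHERE carts.checkedout = 1
          AND cartitems.itemid != ?
          AND cartitems.cartid IN (
              SELECT cartid FROM cartitems WHERE itemid = ?
          )
        GROUP BY items.id
        ORDER BY c DESC, items.name
        LIMIT ?
    """
    return conn.execute(sql, (item_id, item_id, limit)).fetchall()

print(related_items(conn, 1))  # [('Chips', 2), ('Soda', 1)]
```

Passing the item id as a bound parameter instead of pasting it into the SQL string keeps the query safe to call with user-supplied values.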
Upvotes: 0
Reputation: 33512
I think the best approach is to categorize your items and use that information to make the choice.
I did this on a grocery website and the results worked quite well. The idea is to cross group items into a number of categories.
For example, let's take a banana. It's a fruit, but it is also commonly used with cornflakes or cereal for breakfast. Cereals are also a breakfast food, but certain ones might be considered health foods while others are sugary treats.
With this sort of approach, you can quickly start making a table like this:
Item         | Category
-------------+------------
Banana       | Breakfast
Banana       | Quick
Banana       | Fruit
Banana       | Healthy
Muesli       | Breakfast
Muesli       | Healthy
Sugar Puffs  | Breakfast
Sugar Puffs  | Treat
Kiwi Fruit   | Fruit
Kiwi Fruit   | Healthy
Kiwi Fruit   | Dessert
Milk         | Breakfast
With a simple lookup like this, you can easily find good items to suggest based on these groupings.
Let's say someone's basket contains a Banana, Muesli and Sugar Puffs.
That's three breakfast items, two healthy, one not so much.
Suggest Milk, as it matches all three. No impulse buy? Try again and throw in a Kiwi Fruit, and so on.
The idea here is to match items across many different categories (especially ones that may not be directly apparent) and use these counts to suggest the best items for your customer.
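The matching described above could be sketched like this, using the item/category pairs from the table. The scoring rule (count how many basket items share at least one category with the candidate) and the `suggest` function are one possible interpretation, not necessarily what the original site did.

```python
from collections import defaultdict

# Item/category pairs taken from the table above.
categories = [
    ("Banana", "Breakfast"), ("Banana", "Quick"),
    ("Banana", "Fruit"), ("Banana", "Healthy"),
    ("Muesli", "Breakfast"), ("Muesli", "Healthy"),
    ("Sugar Puffs", "Breakfast"), ("Sugar Puffs", "Treat"),
    ("Kiwi Fruit", "Fruit"), ("Kiwi Fruit", "Healthy"),
    ("Kiwi Fruit", "Dessert"),
    ("Milk", "Breakfast"),
]

def suggest(basket, pairs):
    """Rank items outside the basket by how many basket items they match."""
    by_item = defaultdict(set)
    for item, category in pairs:
        by_item[item].add(category)

    scores = {}
    for candidate, cats in by_item.items():
        if candidate in basket:
            continue
        # Score = number of basket items sharing at least one category.
        scores[candidate] = sum(
            1 for b in basket if by_item.get(b, set()) & cats
        )
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

print(suggest(["Banana", "Muesli", "Sugar Puffs"], categories))
# [('Milk', 3), ('Kiwi Fruit', 2)]
```

Milk comes first because it shares Breakfast with all three basket items, and Kiwi Fruit is the fallback, which mirrors the order suggested in the text.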
Upvotes: 6
Reputation: 1054
You could use an artificial neural network which learns to combine different products based on previous purchases.
Here are two resources on the topic:
http://en.wikipedia.org/wiki/Artificial_neural_network
http://www.ai-junkie.com/ann/evolved/nnt1.html
Upvotes: 0