Sovos

Reputation: 3390

Neo4j and Cluster Analysis

I'm developing a web application that will depend heavily on its ability to suggest items based on users with similar preferences. A friend of mine told me that what I'm looking for, mathematically, is some Cluster Analysis algorithm. On the other hand, here on SO, I was told that Neo4j (or some other graph DB) is the kind of DB I should look at for this task (the preferences one).

I started studying both of these tools, and I have some doubts. For Cluster Analysis purposes, it looks to me like a standard SQL DB would still be the perfect choice, while Neo4j would be better suited to a Neural Network kind of approach (although still perfectly fit for the task).

Am I missing something? Am I trying to use the wrong combination of tools?

I would love to hear some ideas on the subject.

Thanks for sharing

Upvotes: 2

Views: 4362

Answers (3)

ulkas

Reputation: 5918

This depends on your data. Neo4j can provide even complex recommendations in real time for one particular node. Say you want to recommend a product to a user: a graph DB can handle that in real time.

A clustering system, on the other hand, is the best way to compute recommendations for all users at once (and then maybe save them somewhere so you don't need to calculate them again).

The computational difference:

  • Neo4j has no initialization cost and can give you one recommendation in an acceptable time
  • clustering needs more time for initialization (most likely minutes or hours rather than seconds) and is better suited to calculating recommendations for the whole dataset. In fact, counting strictly the time for one calculation for a specific user, clustering can be faster than Neo4j, but the big restriction is the initialization cost, which makes it a poor fit for real-time applications

The practical difference:

  • if you have mostly static data and it is OK for you to compute recommendations once in a while, then do clustering with SQL

  • if you have dynamic data that is updated with each interaction, and you always need to provide the newest recommendation, then use Neo4j
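
To make the trade-off above concrete, here is a minimal sketch in plain Python (no Neo4j or SQL involved). The `likes` data and the function names are made up for illustration. `recommend_for` imitates a graph traversal for a single user (user -> item -> other user -> item), while `recommend_for_all` imitates the batch approach that computes everything up front and stores it. A real clustering job would be heavier than this; the batch function only shows the "compute once, then look up" shape.

```python
from collections import Counter

# Hypothetical toy data: user -> set of liked items
# (stands in for LIKES relationships in a graph).
likes = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "lamp", "desk"},
    "carol": {"pen", "chair"},
    "dave": {"mug"},
}


def build_item_index(likes):
    """Invert user -> items into item -> users, so edges can be walked backwards."""
    index = {}
    for user, items in likes.items():
        for item in items:
            index.setdefault(item, set()).add(user)
    return index


def recommend_for(user, likes, liked_by, top_n=3):
    """On-demand recommendation for one user, touching only that user's neighbourhood."""
    own = likes[user]
    scores = Counter()
    for item in own:
        for other in liked_by[item]:
            if other == user:
                continue
            # Every item the similar user likes that `user` does not yet have.
            for candidate in likes[other] - own:
                scores[candidate] += 1
    return [item for item, _ in scores.most_common(top_n)]


def recommend_for_all(likes, top_n=3):
    """Batch variant: compute recommendations for every user once and keep them."""
    liked_by = build_item_index(likes)
    return {user: recommend_for(user, likes, liked_by, top_n) for user in likes}


liked_by = build_item_index(likes)
print(recommend_for("alice", likes, liked_by))  # one user, computed on demand
cache = recommend_for_all(likes)                # all users, computed up front
print(cache["bob"])
```

The on-demand call stays current as soon as `likes` changes, while the cached result has to be recomputed to pick up new interactions, which is the static versus dynamic distinction made above.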

Upvotes: 6

Alessandro Negro

Reputation: 517

Let me introduce Reco4J (http://www.reco4j.org). It is an open source framework that provides recommendations based on a graph database source. It uses Neo4j as its graph database management system. Have a look at it and contact us if you are interested in support. It is in a really early release, but we are working hard to provide extended documentation and new interesting features.

Cheers, Alessandro

Upvotes: 2

bendaizer

Reputation: 1235

I am currently working on various topics related to recommendation and clustering with Neo4j. I'm not exactly sure what you're looking for, but depending on how you model your data in the graph, you can easily work out clustering algorithms based on counting links to various types of nodes.

If you plan your nodes and relationships correctly, you can then identify groups of nodes that share the most links to a set of categories.
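
As a rough illustration of grouping nodes by shared links, here is a small sketch in plain Python. It is one possible reading of the idea, not necessarily the approach I use in practice: the `links` data, the Jaccard similarity measure, and the 0.5 threshold are all assumptions chosen for the example. Users whose sets of category links overlap enough are merged into the same group (single-link grouping via union-find).

```python
from itertools import combinations

# Hypothetical toy data: user node -> set of category nodes it links to.
links = {
    "alice": {"scifi", "fantasy", "history"},
    "bob": {"scifi", "fantasy"},
    "carol": {"cooking", "travel"},
    "dave": {"cooking", "travel", "history"},
}


def jaccard(a, b):
    """Number of shared links divided by the number of distinct links."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_shared_links(links, threshold=0.5):
    """Merge users whose link similarity meets the threshold (connected components)."""
    parent = {user: user for user in links}

    def find(user):
        # Follow parent pointers to the group's representative, compressing the path.
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    for a, b in combinations(links, 2):
        if jaccard(links[a], links[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for user in links:
        groups.setdefault(find(user), set()).add(user)
    # Sort the groups so the result does not depend on merge order.
    return sorted(groups.values(), key=lambda group: sorted(group))


print(group_by_shared_links(links))
```

With this data, alice and bob share two of three distinct categories and end up together, as do carol and dave; the single shared "history" link between alice and dave is below the threshold. In a graph DB the shared-link counts would come from a traversal instead of in-memory sets.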

Upvotes: 2
