Reputation: 1697
I'm working on a recommender system using two basic entities: users and objects. User similarity metrics will be pre-calculated based on existing user data. Then, as various users "flag" objects, objects will be recommended to each user (based on what's been flagged by similar users).
I'm new to NoSQL and unsure of the best way to model a) user flag events, and b) user-specific recommendations. Two options seem obvious to me:
1) "Heavyweight" option: store all relevant data in the primary objects. E.g.:
UserA
    FlaggedItems
        FlaggedItemA
        FlaggedItemB
        FlaggedItemC
    RecommendedItems
        RecommendedItemA
        RecommendedItemB
        RecommendedItemC
or:
ItemA
    FlaggedBy
        UserA
        UserC
        UserR
    RecommendedTo
        UserB
        UserD
        UserX
2) "Lightweight" option: store "Flag" and "Recommendation" data in granular objects. E.g.:
FlagEvent
    FlaggedBy
        UserA
    FlaggedItem
        ItemA
    DateTime
RecommendationEvent
    RecommendationTo
        UserC
    RecommendedItem
        ItemB
    DateTime
My assumption is that the lightweight method would be more scalable: the User/Item objects wouldn't be constantly modified, client synchronization would involve grabbing user-specific FlagEvents and RecommendationEvents, and there would be no chance of multiple users trying to modify the same object simultaneously. But I'm new to CouchDB/NoSQL and welcome thoughts from more experienced users. What would you suggest?
Upvotes: 2
Views: 384
Reputation: 73722
In general, the FlagEvent and RecommendationEvent system is most like typical CouchDB models.
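To make the document-per-event idea concrete, here is a minimal sketch of what the two kinds of documents could look like as JSON. The field names (`type`, `flagged_by`, `recommended_to`, and so on) are my own illustrative choices based on the outline in the question, not anything CouchDB requires.

```python
import json
from datetime import datetime, timezone


def flag_event(user, item, when):
    """Build one FlagEvent document (field names are illustrative)."""
    return {
        "type": "FlagEvent",
        "flagged_by": user,
        "flagged_item": item,
        "datetime": when.isoformat(),
    }


def recommendation_event(user, item, when):
    """Build one RecommendationEvent document (field names are illustrative)."""
    return {
        "type": "RecommendationEvent",
        "recommended_to": user,
        "recommended_item": item,
        "datetime": when.isoformat(),
    }


# A fixed timestamp keeps the example reproducible.
when = datetime(2012, 1, 1, tzinfo=timezone.utc)
docs = [
    flag_event("UserA", "ItemA", when),
    recommendation_event("UserC", "ItemB", when),
]
print(json.dumps(docs, indent=2))
```

Each event is a small, independent document, so the User and Item documents never need to be rewritten when someone flags something.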
With recommendations, having a document per "event" is neat because a user's big-picture recommendation summary is probably a reduction of those events. "Here is your top recommendation. And here are some others you might like." Something like that.
By adding, changing, or deleting individual "atomic" recommendation items, you influence the final output.
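As a rough sketch of that reduction, the snippet below folds a list of recommendation events into a ranked list per user. It runs in plain Python rather than as a CouchDB map/reduce view, and both the field names and the "rank by number of events" rule are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def summarize(events):
    """Reduce recommendation events to a ranked item list per user.

    Ranking by how many events point at an item is an assumption;
    a real system might weight by similarity score or recency instead.
    """
    per_user = defaultdict(Counter)
    for ev in events:
        per_user[ev["recommended_to"]][ev["recommended_item"]] += 1
    return {
        user: [item for item, _count in counts.most_common()]
        for user, counts in per_user.items()
    }


events = [
    {"recommended_to": "UserC", "recommended_item": "ItemB"},
    {"recommended_to": "UserC", "recommended_item": "ItemB"},
    {"recommended_to": "UserC", "recommended_item": "ItemA"},
]

print(summarize(events))      # ItemB ranks first for UserC
print(summarize(events[2:]))  # with the ItemB events gone, only ItemA remains
```

Deleting the two ItemB events changes the summary without touching any user document, which is the point of keeping the events atomic.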
Flag events work the same way. Typically, a flag (or "like", or "+1", or whatever) is unique for a user and item. Therefore you might use the _id to store something like username eventid pairs. Then it will be impossible to flag something twice, because every user/item combo has one and only one document to represent that flag. Create or delete documents to flag/unflag for a user.
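Here is a small sketch of the deterministic _id idea. CouchDB itself rejects an attempt to create a document whose _id already exists (it answers with a conflict), so the in-memory `FlagStore` below is only a stand-in for that behavior. The `flag:username:item` id format and all class and function names are hypothetical.

```python
class ConflictError(Exception):
    """Stand-in for the conflict a database reports on a duplicate _id."""


def flag_id(username, item_id):
    # Hypothetical id scheme: one id per user/item combination.
    return f"flag:{username}:{item_id}"


class FlagStore:
    """In-memory stand-in for a document store keyed by _id."""

    def __init__(self):
        self._docs = {}

    def flag(self, username, item_id):
        doc_id = flag_id(username, item_id)
        if doc_id in self._docs:
            raise ConflictError(doc_id)  # already flagged by this user
        self._docs[doc_id] = {"_id": doc_id, "user": username, "item": item_id}
        return doc_id

    def unflag(self, username, item_id):
        # Deleting the document removes the flag.
        self._docs.pop(flag_id(username, item_id), None)

    def is_flagged(self, username, item_id):
        return flag_id(username, item_id) in self._docs


store = FlagStore()
store.flag("UserA", "ItemA")
try:
    store.flag("UserA", "ItemA")  # second flag for the same pair
    duplicate_rejected = False
except ConflictError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Because the id is derived from the user and the item, no extra lookup or locking is needed to keep flags unique.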
Obviously, you know your data the best. But those are my first ideas. Of course, when somebody says "recommendation engine," people often immediately think "graph database" and not "document database". However, I do not know of any high-profile recommendation engines built on open source graph databases (yet).
Upvotes: 2