Using CouchDB to model user preferences/recommendations

Question

I'm working on a recommender system using two basic entities: users and objects. User similarity metrics will be pre-calculated based on existing user data. Then, as various users "flag" objects, objects will be recommended to each user (based on what's been flagged by similar users).

I'm new to NoSQL and unsure of the best way to model a) user flag events, and b) user-specific recommendations. Two options seem obvious to me:

1) "Heavyweight" option: store all relevant data in the primary objects. E.g.:

UserA
    FlaggedItems
        FlaggedItemA
        FlaggedItemB
        FlaggedItemC
    RecommendedItems
        RecommendedItemA
        RecommendedItemB
        RecommendedItemC

or:

ItemA
    FlaggedBy
        UserA
        UserC
        UserR
    RecommendedTo
        UserB
        UserD
        UserX

2) "Lightweight" option: store "Flag" and "Recommendation" data in granular objects. E.g.:

FlagEvent
    FlaggedBy
        UserA
    FlaggedItem
        ItemA
    DateTime

RecommendationEvent
    RecommendationTo
        UserC
    RecommendedItem
        ItemB
    DateTime

My assumption is that the lightweight method would be more scalable: the User/Item objects wouldn't be constantly modified, client synchronization would involve grabbing user-specific FlagEvents and RecommendationEvents, and there would be no chance of multiple users trying to modify the same object simultaneously. But I'm new to CouchDB/NoSQL and welcome thoughts from more experienced users. What would you suggest?

JasonSmith · Accepted Answer

In general, the FlagEvent and RecommendationEvent system is closest to how data is typically modeled in CouchDB.

With recommendations, having a document per "event" is neat because a user's big-picture recommendation summary is probably a reduction of those events. "Here is your top recommendation. And here are some others you might like." Something like that.

By adding, changing, or deleting individual "atomic" recommendation items, you influence the final output.
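To make the "reduction of events" idea concrete, here is a minimal sketch in plain Python. It only mimics the idea in memory; in CouchDB itself this would normally be done with a map/reduce view. The field names (`type`, `user`, `item`, `score`) are assumptions for illustration, and `score` in particular is not part of the RecommendationEvent from the question.

```python
from collections import defaultdict

# Hypothetical recommendation event documents; all field names are assumptions.
events = [
    {"_id": "rec1", "type": "recommendation", "user": "UserC", "item": "ItemB", "score": 5},
    {"_id": "rec2", "type": "recommendation", "user": "UserC", "item": "ItemA", "score": 3},
    {"_id": "rec3", "type": "recommendation", "user": "UserC", "item": "ItemB", "score": 2},
    {"_id": "rec4", "type": "recommendation", "user": "UserD", "item": "ItemA", "score": 4},
]


def summarize(docs, user, limit=3):
    """Reduce one user's recommendation events to a ranked list of (item, total score)."""
    totals = defaultdict(int)
    for doc in docs:
        if doc.get("type") == "recommendation" and doc["user"] == user:
            totals[doc["item"]] += doc["score"]
    # Highest total first; ties are broken by item name for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


top = summarize(events, "UserC")
print(top)  # [('ItemB', 7), ('ItemA', 3)]
```

Dropping a single event (say `rec1`) and calling `summarize` again changes the ranking, which is the point: the summary is derived from the individual event documents and is never edited directly.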

Flag events work the same way. Typically, a flag (or "like", or "+1", or whatever) is unique for a user and item. Therefore you might use the _id to store something like a username/item ID pair. Then it will be impossible to flag something twice, because every user/item combination has one and only one document to represent that flag. Create or delete documents to flag or unflag for a user.
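A small sketch of the one-document-per-user/item idea, in plain Python. `FakeDB` is an in-memory stand-in written for this example, not a real CouchDB client, and the `flag:<user>:<item>` naming scheme is just one hypothetical choice. The only CouchDB behavior it imitates is that creating a document whose _id already exists is rejected (CouchDB answers with a 409 Conflict). A real delete would also need the document's current _rev, which is left out here.

```python
class Conflict(Exception):
    """Raised when a document with the same _id already exists."""


class FakeDB:
    """Tiny in-memory stand-in for a CouchDB database, for illustration only."""

    def __init__(self):
        self.docs = {}

    def create(self, doc):
        if doc["_id"] in self.docs:
            raise Conflict(doc["_id"])
        self.docs[doc["_id"]] = doc

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)


def flag_id(user, item):
    # Hypothetical scheme: one deterministic _id per user/item pair.
    return f"flag:{user}:{item}"


def flag(db, user, item):
    """Flag an item; return False if this user has already flagged it."""
    doc = {"_id": flag_id(user, item), "type": "flag", "user": user, "item": item}
    try:
        db.create(doc)
        return True
    except Conflict:
        return False


def unflag(db, user, item):
    """Remove the flag by deleting its document."""
    db.delete(flag_id(user, item))


db = FakeDB()
first = flag(db, "UserA", "ItemA")   # creates flag:UserA:ItemA
second = flag(db, "UserA", "ItemA")  # rejected, the document already exists
```

Because the _id is derived from the pair, the uniqueness check costs nothing extra: the database enforces it. If usernames or item IDs can contain the separator character, a different separator or an encoded form would be needed to keep the _id unambiguous.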

Obviously, you know your data the best. But those are my first ideas. Of course, when somebody says "recommendation engine," people often immediately think "graph database" and not "document database"; however, I do not know of any high-profile recommendation engines built on open source graph databases (yet).
