anomareh

Reputation: 5574

Activity streams / feeds, to denormalize or not?

I know variations of this question have been asked many times before (and I've read them, two of them being: 1, 2), but I can't find anything that feels like the right solution.

Everything has been suggested, from many-to-many relations to fanout, polymorphic associations, NoSQL solutions, message queues, denormalization, and combinations of them all.

I know this question is very situational, so I'll briefly explain mine:

In the meantime, I ended up going with a denormalized setup: a single events table with the columns id, date, user_id, action, root_id, object_id, object, data.

user_id being the person that triggered the event.
action being the action.
root_id being the user the object belongs to.
object being the object type.
data containing the minimum amount of information needed to render the event in a user's stream.

Then, to get the desired events, I just grab all rows in which the user_id is the id of a user followed by the user whose stream we're building.
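To make the setup above concrete, here is a minimal sketch of it using Python's built-in sqlite3 module. The events columns are the ones listed above; the follows table, its column names, the index, and the helper names (build_db, record_event, stream_for) are assumptions made only for illustration, as is storing data as JSON text.

```python
import json
import sqlite3


def build_db():
    """Create an in-memory database with the events table described above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Assumed helper table: who follows whom.
        CREATE TABLE follows (
            follower_id INTEGER NOT NULL,
            followed_id INTEGER NOT NULL
        );
        CREATE TABLE events (
            id        INTEGER PRIMARY KEY,
            date      TEXT    NOT NULL,  -- ISO 8601 strings sort chronologically
            user_id   INTEGER NOT NULL,  -- who triggered the event
            "action"  TEXT    NOT NULL,
            root_id   INTEGER,           -- user the object belongs to
            object_id INTEGER,
            object    TEXT,              -- object type
            data      TEXT               -- denormalized payload (JSON here)
        );
        -- Assumed index so the stream query can seek by actor and date.
        CREATE INDEX events_user_date ON events (user_id, date);
    """)
    return conn


def record_event(conn, date, user_id, action, root_id, object_id, obj, data):
    """Insert one event, serializing the render payload as JSON."""
    conn.execute(
        'INSERT INTO events (date, user_id, "action", root_id, object_id, object, data) '
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (date, user_id, action, root_id, object_id, obj, json.dumps(data)),
    )


def stream_for(conn, viewer_id, limit=20):
    """Return the newest events triggered by users that viewer_id follows."""
    rows = conn.execute(
        """
        SELECT e.user_id, e."action", e.object, e.object_id, e.data
        FROM events AS e
        JOIN follows AS f ON f.followed_id = e.user_id
        WHERE f.follower_id = ?
        ORDER BY e.date DESC, e.id DESC
        LIMIT ?
        """,
        (viewer_id, limit),
    ).fetchall()
    return [(u, a, o, oid, json.loads(d)) for u, a, o, oid, d in rows]
```

Because data already holds everything needed to render the event, stream_for never has to join against the tables of the objects themselves; that is the trade the denormalization is making.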

It works, but the denormalization just feels wrong. Polymorphic associations seem similarly so. Fanout seems to be somewhere in between, but feels very messy.

With all my searching on the issue, and reading the numerous questions here on SO, I just can't get anything to click and feel like the right solution.

Any experience, insight, or help anyone can offer is greatly appreciated. Thanks.

Upvotes: 18

Views: 4367

Answers (2)

Anup Marwadi

Reputation: 2577

I think using a combination of NoSQL/Memcached may suit your needs. Please see this URL for further ideas:

http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture

Upvotes: 0

Denis de Bernardy

Reputation: 78443

I've never dealt with social activity feeds, but based on your description they're quite similar to maintaining tricky business activity logs.

Personally, it's a case I tend to manage with separate tables for applicable activity types, a revisions/logs table for each of these types, and each of the latter with a reference to a more central event logs table.

The central event logs table is what lets you build the feed, and it looks a lot like the solution you came up with: event_id, event_at, event_name, event_by, event_summary, event_type. (The event_type field is a varchar containing the name of the table or object.)

You probably don't need to maintain the history of everything in your case (surely this is less appropriate for friend requests than for sales and stock movements), but maintaining some kind of central event logs table (in addition to the other applicable tables, so the normalized data stays at hand) is, I think, the correct approach.
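A rough sketch of that layout, using Python's sqlite3 module, might look like the following. Only the event_log columns come from the description above; the sale_log table, its columns, and the function names are hypothetical stand-ins for one activity type and its log.

```python
import sqlite3


def build_log_db():
    """Create the central event log plus one hypothetical per-type log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE event_log (
            event_id      INTEGER PRIMARY KEY,
            event_at      TEXT    NOT NULL,
            event_name    TEXT    NOT NULL,
            event_by      INTEGER NOT NULL,
            event_summary TEXT,
            event_type    TEXT    NOT NULL  -- name of the table or object
        );
        -- Hypothetical per-type log: each row points at its central event.
        CREATE TABLE sale_log (
            sale_id      INTEGER NOT NULL,
            amount_cents INTEGER NOT NULL,
            event_id     INTEGER NOT NULL REFERENCES event_log (event_id)
        );
    """)
    return conn


def log_sale_change(conn, sale_id, amount_cents, event_at, event_by, event_name):
    """Write the central event and the per-type log row in one transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO event_log "
            "(event_at, event_name, event_by, event_summary, event_type) "
            "VALUES (?, ?, ?, ?, ?)",
            (event_at, event_name, event_by,
             f"sale {sale_id} {event_name}: {amount_cents} cents", "sale_log"),
        )
        event_id = cur.lastrowid
        conn.execute(
            "INSERT INTO sale_log (sale_id, amount_cents, event_id) VALUES (?, ?, ?)",
            (sale_id, amount_cents, event_id),
        )
    return event_id


def feed(conn, limit=20):
    """Build the feed from the central table alone, newest first."""
    return conn.execute(
        "SELECT event_at, event_by, event_summary FROM event_log "
        "ORDER BY event_at DESC, event_id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

The feed reads only event_log, while the normalized detail stays in the per-type table and can be looked up through event_id when needed.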

You might get some interesting insights by looking at audit log related questions:

https://stackoverflow.com/search?q=audit+log

Upvotes: 2
