C* Modeling a timeLine

Question

Just for fun, I am building a Twitter clone to get a better understanding of C* (Cassandra).

All the suggested C* schemas that I have seen use more or less the same modeling technique. My concern is that I doubt modeling the Twitter timeline in this fashion will scale.

The problem: What happens if userA (a rock star), or several users like him, is extremely popular and is followed by 10k+ users? Each time userA publishes a tweet, we have to insert a copy of it into the timeline table for every one of those 10k+ followers.
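To put rough numbers on that concern, here is a minimal Python sketch of the write amplification. The function name and the posting rate are my own illustrative assumptions, not measurements from a real cluster.

```python
def timeline_rows_written(followers: int, tweets_per_day: int = 1) -> int:
    """Rows inserted into the timeline table per day for one author.

    Assumes the fan-out-on-write model described above: every tweet is
    copied once into the timeline of each follower.
    """
    if followers < 0 or tweets_per_day < 0:
        raise ValueError("counts must be non-negative")
    return followers * tweets_per_day


if __name__ == "__main__":
    # Hypothetical figures: 10k followers, 20 tweets a day.
    print(timeline_rows_written(10_000, 20))
```

The count grows linearly with both the follower count and the posting rate, which is the part I am worried about.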

Questions: Will this model really scale? Can anyone suggest alternative ways of modeling the timeline that scale better?

C* Schema:

CREATE TABLE users (
 uname text, -- UserA
 followers set<text>, -- Users who follow userA
 following set<text>, -- Users whom userA follows
 PRIMARY KEY (uname)
);
-- View of tweets created by user
CREATE TABLE userline (
 tweetid timeuuid,
 uname text,
 body text,
 PRIMARY KEY(uname, tweetid)
);
-- View of tweets created by user, and users he/she follows
CREATE TABLE timeline (
 uname text,
 tweetid timeuuid,
 posted_by text,
 body text,
 PRIMARY KEY(uname, tweetid)
);


-- Example of userA posting a tweet (pseudocode: CQL plus a client-side loop)
-- Note: now() returns a new timeuuid on every call, so in practice the id
-- would be generated once and reused in every statement below.
-- BATCH START
-- Store the tweet in the tweets table (not shown in the schema above)
INSERT INTO tweets (tweetid, uname, body) VALUES (now(), 'userA', 'Test tweet #1');

-- Store the tweet in this user's userline
INSERT INTO userline (uname, tweetid, body) VALUES ('userA', now(), 'Test tweet #1');

-- Store the tweet in this user's timeline
INSERT INTO timeline (uname, tweetid, posted_by, body) VALUES ('userA', now(), 'userA', 'Test tweet #1');

-- Store the tweet in the public timeline
INSERT INTO timeline (uname, tweetid, posted_by, body) VALUES ('#PUBLIC', now(), 'userA', 'Test tweet #1');

-- Insert the tweet into follower timelines
-- findUserFollowers = SELECT followers FROM users WHERE uname = 'userA';
for (String follower : findUserFollowers('userA')) {
    INSERT INTO timeline (uname, tweetid, posted_by, body) VALUES (follower, now(), 'userA', 'Test tweet #1');
}
-- BATCH END
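
For reference, the client side of the batch above could look roughly like this in Python. This is only a sketch under my own assumptions: `build_tweet_writes` is a hypothetical helper, the `%s` placeholder style would need adapting to whatever driver is used, and nothing is actually sent to a cluster. It generates the tweet id once and reuses it for every row.

```python
import uuid


def build_tweet_writes(author, body, followers):
    """Return (cql, params) pairs for one tweet, all sharing a single tweet id."""
    # uuid1() is a time-based (version 1) UUID, the kind a timeuuid column stores.
    tweet_id = uuid.uuid1()
    timeline_cql = ("INSERT INTO timeline (uname, tweetid, posted_by, body) "
                    "VALUES (%s, %s, %s, %s)")
    writes = [
        ("INSERT INTO tweets (tweetid, uname, body) VALUES (%s, %s, %s)",
         (tweet_id, author, body)),
        ("INSERT INTO userline (uname, tweetid, body) VALUES (%s, %s, %s)",
         (author, tweet_id, body)),
    ]
    # Author's own timeline, the public timeline, then one row per follower.
    for owner in [author, "#PUBLIC", *followers]:
        writes.append((timeline_cql, (owner, tweet_id, author, body)))
    return writes
```

With N followers this produces 4 + N statements per tweet, so a 10k-follower account means roughly 10k writes for a single post.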

Thanks in advance for any suggestions.
