Alex

Reputation: 6109

How should I (or should I not) use Cassandra and Redis together to build a scalable one-on-one chat application?

Up until now I have used MySQL to do pretty much everything, but I don't like the thought of sharding my data manually and maintaining all of that for now.

I want to build a one-on-one chat application like Facebook's and WhatsApp's, as in the picture below:

[Image: chat interface, with the list of chat threads on the left and the messages of the open thread on the right]

So we have two parts here. The right part is simply all the messages in a chat thread. The left part shows the chat threads, each with information from its last message and your chat partner's details, such as name and image.

So far this is what I have:

Cassandra is really good at writing and reading, but not so much at deleting data, because of tombstones. And you don't want to set gc_grace_seconds to 0, because if a node goes down while deletes occur, a deleted row might come back to life when repair is done. To avoid that, we might end up having to wipe all data from the node before it re-enters the cluster. Anyway, as I understand it, Cassandra would be perfect for the right part of this chat app, since messages are stored and ordered by their insertion time and that order never changes. You just write and read, which is what Cassandra is good at.

I have this table to store messages for the right part:

CREATE TYPE user_data_for_message (
    from_id INT,
    to_id INT,
    from_username TEXT,
    to_username TEXT,
    from_image_name TEXT,
    to_image_name TEXT
);

CREATE TABLE message_by_thread_id (
    message_id TIMEUUID,
    thread_id UUID,
    user_data FROZEN <user_data_for_message>,
    message TEXT,
    created_time INT,
    is_viewed BOOLEAN,
    PRIMARY KEY (thread_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);

Before I insert a new message, if the thread_id is not provided by the client, I can check whether a thread between the two users already exists. I can store that information like this:

CREATE TABLE message_thread_by_user_ids (
    thread_id UUID,
    user1_id INT,
    user2_id INT,
    PRIMARY KEY (user1_id, user2_id)
);

I could store two rows for every thread, with user1 and user2 in reversed order, so that I need just one read to check for existence. Since I don't want to query Cassandra for an existing thread before every insert, I could first check Redis, since it is in memory and much faster.

I could save the same information in Redis too, like this (not both ways as in Cassandra, but one way only, to save memory; we can do two GETs to check for it):

SET user:1:user:2:message_thread_id 123e4567-e89b-12d3-a456-426655440000

So before I send a message, I could first check in Redis whether a thread exists between the two users. If it is not found in Redis, I could check in Cassandra (in case the Redis server was down at some point and did not save it). If a thread exists, I just use that thread_id to insert the new message. If not, I create the thread and insert it in the table:

message_thread_by_user_ids

Then I insert it in Redis with the SET command above, and finally insert the message in:

message_by_thread_id
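The lookup order described above can be sketched roughly as follows. This is only an illustration, assuming plain Python dicts stand in for Redis and for the message_thread_by_user_ids table; the names redis_cache, cassandra_threads, redis_key and get_or_create_thread_id are hypothetical and not part of any client library.

```python
import uuid

# Hypothetical in-memory stand-ins (assumptions for illustration only):
# a dict for Redis string keys and a dict for message_thread_by_user_ids.
redis_cache = {}
cassandra_threads = {}


def redis_key(a, b):
    # Same key layout as the SET command above.
    return f"user:{a}:user:{b}:message_thread_id"


def get_or_create_thread_id(from_id, to_id):
    # 1. Two GETs in Redis, because the key is stored one way only.
    for a, b in ((from_id, to_id), (to_id, from_id)):
        thread_id = redis_cache.get(redis_key(a, b))
        if thread_id is not None:
            return thread_id

    # 2. Fall back to Cassandra, which holds both orderings of the pair.
    thread_id = cassandra_threads.get((from_id, to_id))
    if thread_id is None:
        # 3. No thread yet: create it and store both orderings.
        thread_id = str(uuid.uuid4())
        cassandra_threads[(from_id, to_id)] = thread_id
        cassandra_threads[(to_id, from_id)] = thread_id

    # 4. Repopulate Redis so the next lookup stays in memory.
    redis_cache[redis_key(from_id, to_id)] = thread_id
    return thread_id
```

With real stores, two clients could race at step 3 and each create a thread; the sketch ignores that case.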

OK, now comes the tricky part. The left part of the chat does not have a static sort order; the ordering changes all the time. When a conversation gets a new message, that conversation goes to the top. I have not found a good way to model this in Cassandra without deletes and inserts: I have to delete a row and then re-insert it for the table to reorder its rows. Deleting and inserting a row every time I send a message does not sound like a good idea to me, but I might be wrong; I am not experienced with Cassandra.

So my thought was that I could use Redis for the left part. The only problem is that if the Redis server goes down, the most recent chat conversations on the left side will be lost, even though the chat itself is preserved in Cassandra. Users would need to send a new message for the conversation to appear again.

I thought I could do this in Redis in the following way:

Every time a user sends a message, for example when user 1 sends a message to user 2, I could do this:

ZADD user:1:message_thread_ids 1510624312 123e4567-e89b-12d3-a456-426655440000

ZADD user:2:message_thread_ids 1510624312 123e4567-e89b-12d3-a456-426655440000

Each sorted set keeps track of the ids of that user's most recently active threads, sorted by Unix timestamp.
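To make the reordering concrete, here is a small sketch of how the two ZADD calls behave. It assumes a dict of dicts as a stand-in for Redis sorted sets; zadd, zrevrange and record_message are hypothetical helpers that only mimic the Redis commands ZADD and ZREVRANGE (non-negative indexes only, and ties between equal scores are not ordered the way Redis would order them).

```python
from collections import defaultdict

# Hypothetical stand-in for Redis sorted sets: key -> {member: score}.
sorted_sets = defaultdict(dict)


def zadd(key, score, member):
    # Like ZADD: adding an existing member just overwrites its score.
    sorted_sets[key][member] = score


def zrevrange(key, start, stop):
    # Like ZREVRANGE: members by descending score, stop is inclusive.
    members = sorted_sets[key]
    ordered = sorted(members, key=lambda m: members[m], reverse=True)
    return ordered[start:stop + 1]


def record_message(from_id, to_id, thread_id, unix_ts):
    # One ZADD per participant, as in the two commands above.
    for user_id in (from_id, to_id):
        zadd(f"user:{user_id}:message_thread_ids", unix_ts, thread_id)
```

Because a repeated ZADD only updates the score of an existing member, a thread moves to the top without any delete, which is the property the Cassandra model lacks.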

But then another problem is that every time I load this window, I have to do a ZRANGE to get, for example, the 20 most recent conversations for the left side, and then run 20 single SELECT statements with LIMIT 1 in Cassandra to get the last message of each one, which is perhaps not so efficient. I thought I could instead save the last message of the 20 most recently active conversations in Redis with HMSET, keeping only the most relevant information: the message itself trimmed down to 60 characters, the last_message timestamp, from_username, to_username, from_id, to_id, from_image, to_image, and message_id.

HMSET thread:123e4567-e89b-12d3-a456-426655440000 <... message info ...>

But now I have to keep track of the hashes in Redis and delete the ones that are no longer relevant, since I don't want to keep more than the 20 most recent; otherwise they are going to eat up memory fast. I will get the 20 most recent conversations from Redis, that is, from memory, and if a user scrolls down I will get 10 at a time from Cassandra. The other problem is that if the Redis server goes down, I might lose a conversation from the left side of the app if it is a completely new conversation.
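The bookkeeping for the 20-entry limit could look roughly like this. It is a sketch under assumptions: two plain dicts stand in for one user's sorted set and for the thread hashes, and trim_recent_threads and MAX_THREADS are hypothetical names. In Redis itself, the sorted-set part of the trim would correspond to ZREMRANGEBYRANK user:1:message_thread_ids 0 -21.

```python
MAX_THREADS = 20  # how many recent threads to keep per user


def trim_recent_threads(scores, hashes, keep=MAX_THREADS):
    """Drop everything but the `keep` newest threads.

    scores: {thread_id: unix_ts}, a stand-in for one user's sorted set.
    hashes: {thread_id: summary dict}, a stand-in for the thread hashes.
    Returns the evicted thread ids, newest first.
    """
    newest_first = sorted(scores, key=scores.get, reverse=True)
    evicted = newest_first[keep:]
    for thread_id in evicted:
        del scores[thread_id]
        # Assumption: the summary is only needed by this user.
        hashes.pop(thread_id, None)
    return evicted
```

One caveat: a thread hash keyed only by thread id is shared by both participants, so dropping it because it fell out of one user's top 20 could remove it while it is still in the other user's list. The sketch does not handle that.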

I thought that with this approach I could get a lot of writes per second on the Cassandra side just by adding new nodes, and Redis can do something like 200 000 - 800 000 operations per second, so the deletes and the sorted set updates should not be a limitation. Since there will be some back and forth with the Redis server, I could either pipeline the Redis commands or write Lua scripts, so that I can send the instructions to Redis in one go.

Is this a good idea? How can I solve this issue of the left side of the app that shows active conversations? Is it a good idea to do this in Redis like I suggested or should I do it differently?

Upvotes: 7

Views: 3223

Answers (1)

BestPractice2Go

Reputation: 305

Both are good solutions. But where could the bottlenecks be?

1) Redis is limited to available memory and cannot exceed it. Also, when the server shuts down, you lose your data unless persistence (RDB snapshots or AOF) is configured.

2) When it comes to scaling, Redis uses a master-slave topology with sharding, whereas Cassandra uses a ring topology in which every node is equal for writes and reads.

In my opinion, I would rather use Cassandra, knowing it isn't as fast as Redis but is fast enough and very easy to scale.

Is this a good idea? How can I solve this issue of the left side of the app that shows active conversations? Is it a good idea to do this in Redis like I suggested or should I do it differently?

How do your users write to each other? I think you do this with a WebSocket, don't you? If so, just track the socket ID and remove it when the socket disconnects.

Another question is: where and how do you retrieve the friend IDs for a certain person (the left side of your picture)?

Upvotes: 1
