user1136875
user1136875

Reputation: 613

Good pratice for SQL message table

I have a table of messages

Key - UserId, MsgIndex
C1...Cn - some data columns
Cn+1 - Date, when message has been added to the table.

Question is the best practice to do with client...

Client ask server for new messages...

There are 2 options to do it:

  1. check for messages with index > lastRxMsgIndex (client will save the last msg index received)
  2. check for messages with date > lastRxMsgDate (client will save the last rx msg date- server will give it to him when getting msg results)

Which is better and faster...

Keeping date/TS or index is the same, common sense says to keep date/TS but it is same for msg index.

MsgIndex is in the table primary key so should it be faster that searching on dates (when user will have many messages...)

Which is the best way?

Thanks Yoav

Upvotes: 0

Views: 385

Answers (4)

Mark Byers
Mark Byers

Reputation: 838696

First regarding performance: you could add an index to your date column to improve the performance of searches by date. You will most likely want to also include the user_id in the index. You could for example use a combined index on (user_id, id) or (user_id, date) so that individual users can quickly find the messages they own without the server having to scan through other users' messages too.

Regarding functionality: One potential issue with using the datetime as a key is that timestamps are not in general unique. It's possible (but unlikely) that if you search based on date you'll miss a message. Here's an example scenario demonstrating the problem:


At 16:01:04.312 the table contains two messages:

id    date                  message
1     2012-02-10 14:23:54   foo
2     2012-02-10 16:01:04   bar

The client already has received row 1 previously and now requests and receives the latest row:

SELECT * FROM your_table WHERE date > '2012-02-10 14:23:54'
(1 row)

Then at 16:01:04.420 a new row comes into the database with the same timestamp:

id    date                  message
1     2012-02-10 14:23:54   foo
2     2012-02-10 16:01:04   bar
3     2012-02-10 16:01:04   baz

Client requests latest row but doesn't get it:

SELECT * FROM your_table WHERE date > '2012-02-10 16:01:04'
(0 rows)

Another issue is if the server's time is adjusted backwards. This could cause later messages to be inserted with an earlier timestamp. These messages will also be missed if you use the date to find the newest messages. It may be better to use the id instead to avoid these potential problems.

Upvotes: 2

Matt Gibson
Matt Gibson

Reputation: 14959

If you have an index on the date field there should be no difference at all (assuming you are using a unix timestamp) because you have two fields with integers in ascending order, where you want all posts where the integer is greater than X.

There is a possibility of non-unique values in the timestamp column, but this will not be a problem unless you use aggregation functions and in that case you can fix it by including the unique primary key in the fields that you select.

Given this fact, you only need to consider:

  • Whether the (very small) overhead of calculating the index on the date column each time data is inserted is worth it
  • Whether the code you write is more readable with ids used or timestamps used.

Personally, I would opt for the timestamp field as it is immediately clear to anyone else reading the code what you are doing, whereas using the id is a bit obscure, and the indexing overhead is trivial.

Upvotes: 1

Adrian
Adrian

Reputation: 5681

The former way you mention is guaranteed to work if you have a lot of incoming messages.
The latter way, if you have 2 messages on the same date (say same second because I don't think you can go more fine grained), you are screwed ;)

I used the 'go by index' method before when I was working on a system which receives lots of messages per second.

Upvotes: 1

user1175165
user1175165

Reputation: 41

Since MsgIndex is the primary key, it has an index. So the access via the MsgIndex is quite faster than using date comparison.

Upvotes: 2

Related Questions