Reputation: 613
I have a table of messages
Key - UserId, MsgIndex
C1...Cn - some data columns
Cn+1 - Date, when message has been added to the table.
Question is the best practice to do with client...
Client ask server for new messages...
There are 2 options to do it:
index > lastRxMsgIndex
(client will save the last msg index received)date > lastRxMsgDate
(client will save the last rx msg date- server will give it to him when getting msg results)Which is better and faster...
Keeping date/TS or index is the same, common sense says to keep date/TS but it is same for msg index.
MsgIndex is in the table primary key so should it be faster that searching on dates (when user will have many messages...)
Which is the best way?
Thanks Yoav
Upvotes: 0
Views: 385
Reputation: 838696
First regarding performance: you could add an index to your date column to improve the performance of searches by date. You will most likely want to also include the user_id
in the index. You could for example use a combined index on (user_id, id)
or (user_id, date)
so that individual users can quickly find the messages they own without the server having to scan through other users' messages too.
Regarding functionality: One potential issue with using the datetime as a key is that timestamps are not in general unique. It's possible (but unlikely) that if you search based on date you'll miss a message. Here's an example scenario demonstrating the problem:
At 16:01:04.312 the table contains two messages:
id date message
1 2012-02-10 14:23:54 foo
2 2012-02-10 16:01:04 bar
The client already has received row 1 previously and now requests and receives the latest row:
SELECT * FROM your_table WHERE date > '2012-02-10 14:23:54'
(1 row)
Then at 16:01:04.420 a new row comes into the database with the same timestamp:
id date message
1 2012-02-10 14:23:54 foo
2 2012-02-10 16:01:04 bar
3 2012-02-10 16:01:04 baz
Client requests latest row but doesn't get it:
SELECT * FROM your_table WHERE date > '2012-02-10 16:01:04'
(0 rows)
Another issue is if the server's time is adjusted backwards. This could cause later messages to be inserted with an earlier timestamp. These messages will also be missed if you use the date to find the newest messages. It may be better to use the id
instead to avoid these potential problems.
Upvotes: 2
Reputation: 14959
If you have an index on the date field there should be no difference at all (assuming you are using a unix timestamp) because you have two fields with integers in ascending order, where you want all posts where the integer is greater than X.
There is a possibility of non-unique values in the timestamp column, but this will not be a problem unless you use aggregation functions and in that case you can fix it by including the unique primary key in the fields that you select.
Given this fact, you only need to consider:
Personally, I would opt for the timestamp field as it is immediately clear to anyone else reading the code what you are doing, whereas using the id is a bit obscure, and the indexing overhead is trivial.
Upvotes: 1
Reputation: 5681
The former way you mention is guaranteed to work if you have a lot of incoming messages.
The latter way, if you have 2 messages on the same date (say same second because I don't think you can go more fine grained), you are screwed ;)
I used the 'go by index' method before when I was working on a system which receives lots of messages per second.
Upvotes: 1
Reputation: 41
Since MsgIndex
is the primary key, it has an index. So the access via the MsgIndex
is quite faster than using date comparison.
Upvotes: 2