Reputation: 25
My scenario is as follows:
So I have a READ_EVENT table in the database to store all the events.
When a cached READ_EVENT comes in, I have to review all the historical data to find out whether it should be integrated with an existing READ_EVENT.
To be clear, "integrate" means checking whether the two durations (from "first_seen_time" to "last_seen_time") of the READ_EVENT in the DB and of the cached READ_EVENT overlap, i.e. have a common period.
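In SQL terms, two periods overlap when each one starts before the other ends, so the check I need would look something like this (the parameter names are just placeholders for the cached event's values):

select *
from read_event
-- the stored period and the cached period share at least one common moment
where first_seen_time <= :cached_last_seen_time
  and last_seen_time >= :cached_first_seen_time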
Here is the question:
Because the "first_seen_time"/"last_seen_time" in cached data could be any time (yesterday, last month, last year), and the table becomes larger and larger, it will be very very difficult to position the READ_EVENT which should be integrated. How to optimize the design of the database.
Upvotes: 0
Views: 36
Reputation: 1741
You'll need one or more indexes, depending on your exact requirements and the database queries resulting from those. Having those indexes, the total number of records will be of minor importance, as a query on a small index range will still be fast and yield no more than a couple of records.
Assume that you have an index on reader_id, card_id and last_seen_time. Now you want to know whether a current event on that reader and card can be joined with the previous events. As far as I understand your requirements, that can only affect the latest record. So a query like
select *
from read_event
where reader_id = :reader_id and card_id = :card_id
order by last_seen_time desc
will yield the latest event as the first record, so only one record needs to be fetched from the database, regardless of how many records the table contains.
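For illustration, such a composite index might be created like this (the index name is just an example):

-- composite index so the query above can seek directly to the
-- newest entry for a given reader/card pair instead of scanning
create index read_event_reader_card_time
  on read_event (reader_id, card_id, last_seen_time);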
Of course, if the record count becomes huge, this might still become a problem in terms of sheer space, or for other use cases working on those data. From your numbers, let's estimate 3 million records per year; after three years, you have close to 10 million. Do you still need each of these records after three years? So the next step would be to decide, from your functional requirements, how long you need old data on a record-by-record basis, and to aggregate them according to the actual needs of everyday business. But I would perform such a step only when the actual need arises. Often, the requirements on a collection of data only evolve with its usage, so if you throw data away too early, a use case might appear for which it is needed. That would be the premature optimization anti-pattern.
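If that need does arise one day, a retention step could be as simple as the following sketch (the archive table and the cutoff parameter are assumptions; adapt them to your DBMS and requirements):

-- keep old events available in an archive table ...
insert into read_event_archive
select *
from read_event
where last_seen_time < :cutoff_date;

-- ... then remove them from the hot table
delete from read_event
where last_seen_time < :cutoff_date;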
Upvotes: 1