Hernán Eche
Hernán Eche

Reputation: 6899

Is there special way to manage an always growing database?

Disclaimer: I am newbie in database world

I wonder: How do you solve the problem of an always-growing table?

I mean, suppose I would like to know "the last added item", then I do

SELECT * FROM planetHistory WHERE name="earth" ORDER BY date DESC LIMIT 1;

That seems to be a bad idea because it would take more and more time as table grows. What come to my mind is to "prepare" the design of database for the usual future queries. For this example the database could include a table with a single field and a instance:

+-----------------+
|table "lastAdded"|
+-----------------+
|     3242        |
+-----------------+

That would store the last added, so before doing the insert I should read that field increment it by one and then write it.

It sound weird, but it seems worse for me to order 1 Terabyte of data "just to know which is the last" that's weirder

Upvotes: 0

Views: 379

Answers (2)

Jesse Webb
Jesse Webb

Reputation: 45243

Ordering by an indexed column does this kind of optimization 'under the hood' for you.

Let the DB designers worry about optimizing query logic, just make sure you utilize the tools they give you like INDEXes. It is their job to ensure queries, even over millions of records, are performant enough.

Doing this kind of 'optimization' yourself may cause several other problems:

  1. Duplicated data which could get out of sync
  2. Unnecessary complexity, assuming there was no performance problem in the first place
  3. Make it harder to fix real performance problems when they arise because more complex, code/queries means it is harder to refactor/optimize

Monitor your app/DB so that you can pro-actively address performance problems but don't solve them before you know there is a problem. Especially when it comes to DBs, they are built to be as fast as possible; when they aren't fast, it is usually because of something we do which is silly.

Upvotes: 1

duffymo
duffymo

Reputation: 308753

I'd do it with timestamps. These would also allow you to partition your database so you can eventually move off a chunk of data that's exceeded your "must retain" limit.

There's no such thing as a database that can grow forever. Disk space isn't infinite. If you have any queries that require table scans you'll find that they run slower and slower as your database grows.

Better to give yourself the ability to move data partitions off in chunks by day, week, month, year, or whatever measure works for your use case.

Upvotes: 0

Related Questions