acemtp
acemtp

Reputation: 3019

how to manage millions/billions of small values in a "database"

I have an application that will generate millions of date/type/value entries. we don't need to do complex queries, only for example get the average value per day of type X between date A and B.

I'm sure a normal db like mysql isn't the best to handle these sort of things, is there a better system that like these sort of data.

EDIT: The goal is not to say that relational database cannot handle my problem but to know if another type of database like key/value database, nosql, document oriented, ... can be more adapted to what i want to do.

Upvotes: 0

Views: 359

Answers (4)

Paddy3118
Paddy3118

Reputation: 4772

You could store the data as fixed length records in a file. Do binary search on the file opened for random access to find your start and end records then sum the appropriate field for the given condition of all records between your start index and end index into the file.

Upvotes: 0

jing
jing

Reputation: 2021

MS SQL Server and Oracle offer concept of Partitioned Tables and Indexes.

In short: you could group your rows by some value, i.e. by year and month. Each group could be accessible as separate table with own index. So you can list, summarize and edit February 2011 sales without accessing all rows. Partitioned Tables complicate the database, but in case of extremely long tables it could lead to significantly better performance.

Upvotes: 1

orangepips
orangepips

Reputation: 9961

If you are dealing with a simple table as such:

CREATE TABLE myTable (
    [DATE] datetime,
    [TYPE] varchar(255),
    [VALUE] varchar(255)
)

Creating an index probably on TYPE,DATE,VALUE - in that order - will give you good performance on the query you've described. Use explain plan or whatever equivalent on the database you're working with to review the performance metrics. And, setup a scheduled task to defragment that index regularly - frequency will depend on how often insert, delete and update occurs.

As far as an alternative persistence store (i.e. NoSQL) you don't gain anything. NoSQL shines when you want schema-less storage. In other words you don't know the entity definitions head of time. But from what you've described, you have a very clear picture of what you want to store, which lends itself well to a relational database.

Now possibilities for scaling over time include partitioning and each TYPE record into a separate table. The partitioning piece could be done by type and/or date. Really would depend on the nature of the queries you're dealing with, if you typically query for values within the same year for instance, and what your database offers in that regard.

Upvotes: 1

sqlmaster
sqlmaster

Reputation: 34

Based upon the costs you can choose either MySQL or SQL Server, in this case you have to be clear that what do you want to achieve with the database just for storage then any RDBMS can handle.

Upvotes: 0

Related Questions