Reputation: 9478
I need to store a very simple data structure on disk - the Point
. It's fields are just:
Moment
- 64-bit integer, representing a time with high precision.EventType
- 32-bit integer, reference to another object.Value
- 64-bit floating point number.Requirements:
Moment
+ EventType
) is unique identifier of the Point
, so I suspect it to be a composite primary key.Points
. Up to 5 billions (1-2 TB of disk space). So the format must be as small as possible.Points
by exact EventType
and a range of Moments
.Questions:
Points
?I don't need a huge professional system with all tools, features and extensions like PostgreSQL or MSSQL. Also I don't need a server, so the choice of SQLite looks optimal. Another great RDBMS with a feature of embedded database is Firebird, but I was seduced by SQLite's dynamic typing paradigm. It looks like it can save me space on disk because integer fields can be stored in "smaller" form (1, 2, 3, 4, 6 bytes).
But first of all, SQLite creates special ROWID
column (64-bits length) when primary key is composite:
CREATE TABLE points (
moment integer not null,
event_id integer not null,
value numeric not null,
PRIMARY KEY (moment, event_id)
);
It means table wastes nearly 40% more space for nothing. I found "The WITHOUT ROWID Optimization". But it will be available only in 3.8.2 version of SQLite (December 2013). Waiting for ADO.NET provider which I need is inappropriate.
Another problem is SQLite uses B-tree for tables. It looks like it is inefficient for selecting data ranges. To select a big block of Points
based on primary key's range it looks SQLite will be a bad choice.
Upvotes: 0
Views: 188
Reputation: 180010
B-trees are the most efficient organization for selecting data ranges.
If you search a constant event_id
value and a range of moment
values, the two-column index can be used for both lookups only if the event_id
is the first column in the index:
CREATE TABLE points (
event_id INTEGER NOT NULL,
moment INTEGER NOT NULL,
value NUMERIC NOT NULL,
PRIMARY KEY (event_id, moment)
);
You should try to use version 3.8.2 so that you can use the WITHOUT ROWID optimization.
The developers are likely to be happy that somebody will test this function, and give you a compiled prerelease version.
Upvotes: 1
Reputation: 453
I think if your table will be use more than one user you should not use embedded DB
What about Oracle - table organized by index (by event perhaps) + partitioning by range
or MySql partition by range
if in your application will be really one user, maybe you can use file system ?
something like partitioned table
you can create folder with name related to range
and create files with name related with event_id so you need store in file only moment + data
even more for example your moment look like
201311141820001234567890123456
you can create folder with name 2013111418 and store in file only part of moment and data
20001234567890123456,data
20001234567890123457,data
Upvotes: 0