Reputation: 473
Let's say there are a number of different sensors, all of which save data in a database as they measure it and each sensors can have more entries. I'm looking for the best way to save this data so that select queries could be done fastest possible later. Something like
"CREATE TABLE IF NOT EXISTS DataTable (sensor_id TEXT, measured_value REAL, time_of_measuring REAL)"
could basically work, but I imagine this wouldn't be very fast for selecting. I know about primary keys, but they prevent duplicates, so I can't just put sensor_id as a primary key. I'm basically looking for sqlite equivalent of saving data like this, but in a single table and as one measurement being one row:
data = {"sensor1":[x1,x2,x3], "sensor2":[z1,z2,z3]...}
I imagine something like ˇˇ would work for inserting more than a single value for each sensor, but would that help at all with selecting?
"CREATE TABLE IF NOT EXISTS DataTable (sensor_id TEXT NOT NULL, measured_value REAL, time_of_measuring REAL NOT NULL, PRIMARY KEY(sensor_id, time_of_measuring ))"
Upvotes: 1
Views: 2800
Reputation: 222502
For this time-series data, the relevant primary (or unique) key is probably (time_of_measuring, sensor_id)
. This is close to what you suggested at the end of your question, but the columns are in reverse order.
Technically, this prevent a sensor from loging two measures at the same point in time, which seems like a relevant business rule for your data.
When it comes to the speed of queries: it highly depends on the query themselves. Say that you have query like:
select sensor_id, measured_val, time_of_measuring
from data_table
where
sensor_id = ?
and time_of_measuring >= ?
and time_of_measuring < ?
order by sensor_id, time_of_measuring
This query would take advantage of the primary key index, since the columns are the same as those of the where
and order by
clauses. You could add the measured_val
to the index to make the query even more efficient:
create index data_table_idx1
on data_table(sensor_id, time_of_measuring, measured_val);
As another example, consider this where
clause:
where time_of_measuring >= ? and time_of_measuring < ?
No predicate on sensor_id
, but time_of_measuring
is the first column in the index, so the primary key index can be used.
As typical counter-examples, the following where
clauses would not benefit the index:
where sensor_id = ? -- need an index where `sensor_id` is first
where sensor_id = ? and measured_val >= ? -- needs an index on "(sensor_id, measured_val)"
Upvotes: 1