ankita.gulati

Reputation: 929

How is data stored in a time series database?

I am curious how a time series database stores data internally. If I use one to store my log data, which arrives at thousands of records per second (millions of records a day), how will the database store it? And if I want to analyze the last 4 months of data, how does it make sure the query responds quickly?

Upvotes: 3

Views: 1077

Answers (2)

wade zhang

Reputation: 1

It depends on the design of the specific time series database. Generally speaking, a time series database is optimized for storing and querying time series data; that is its defining characteristic.

For example, in TDengine, an open source, high-performance TSDB, time series data is first sharded by table name, where each table represents one data collection source. Then, inside each table, the data is sliced by time range, so that rows can be retrieved easily by timestamp. The same sharding/slicing mechanism also makes a data retention policy easy to implement.
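
To make the idea concrete, here is a minimal in-memory sketch of "shard by source, then slice by time range". It is not TDengine's actual implementation or API; the class name `ShardedStore`, its methods, and the one-day slice width are all assumptions chosen for illustration.

```python
from collections import defaultdict

DAY = 86_400  # assumed slice width in seconds (hypothetical default)


class ShardedStore:
    """Toy store: one shard per data source, each shard cut into time slices."""

    def __init__(self, slice_seconds=DAY):
        self.slice_seconds = slice_seconds
        # source name -> slice index -> list of (timestamp, value)
        self.shards = defaultdict(lambda: defaultdict(list))

    def write(self, source, ts, value):
        # The timestamp alone decides which slice a row lands in.
        self.shards[source][ts // self.slice_seconds].append((ts, value))

    def query(self, source, start, end):
        """Return rows with start <= ts < end, scanning only overlapping slices."""
        slices = self.shards.get(source, {})
        first = start // self.slice_seconds
        last = (end - 1) // self.slice_seconds
        rows = []
        for idx in range(first, last + 1):
            for ts, value in slices.get(idx, ()):
                if start <= ts < end:
                    rows.append((ts, value))
        return sorted(rows)

    def drop_before(self, cutoff):
        """Retention: delete every slice that ends at or before cutoff."""
        dropped = 0
        for slices in self.shards.values():
            expired = [i for i in slices if (i + 1) * self.slice_seconds <= cutoff]
            for idx in expired:
                del slices[idx]
                dropped += 1
        return dropped
```

A query for the last 4 months only visits the roughly 120 daily slices of one source, no matter how much older data exists, and retention removes whole slices instead of deleting individual rows.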

Upvotes: -1

Michael Mullany

Reputation: 31750

Different time series databases use different strategies for storing data. Your specific use case will dictate which one is right for you, or whether you'd be better off using a search engine like Elasticsearch instead of a time series database. Timescale was designed by a team that originally built an IoT platform, so some consider Timescale to be the best time series database for IoT. (IoT data is bursty, arrives out of order, has high cardinality, and lives alongside other metadata.)

Timescale uses the underlying PostgreSQL storage engine to write data to persistent storage. Its innovation is an intermediary layer that chunks data into multiple underlying tables, each holding data from a successive time interval, while still presenting a single table to users and consumers. You can read more in the documentation.
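
As a rough illustration of that chunking layer, the sketch below routes inserts into one real table per time interval while the caller only sees one logical table. It is not Timescale's code or API: SQLite is used only because it ships with Python, and `ChunkedTable`, its methods, and the one-week chunk width are hypothetical. It also assumes non-negative integer timestamps.

```python
import sqlite3

CHUNK_SECONDS = 7 * 86_400  # assumed chunk width: one week


class ChunkedTable:
    """Toy logical table backed by one real table per time interval."""

    def __init__(self, conn, name, chunk_seconds=CHUNK_SECONDS):
        self.conn = conn
        self.name = name
        self.chunk_seconds = chunk_seconds
        self.chunks = set()  # chunk indexes that already have a backing table

    def _chunk_table(self, idx):
        return f"{self.name}_chunk_{idx}"

    def insert(self, ts, value):
        idx = ts // self.chunk_seconds
        table = self._chunk_table(idx)
        if idx not in self.chunks:
            # Create the backing table lazily, the first time its interval is hit.
            self.conn.execute(f"CREATE TABLE {table} (ts INTEGER, value REAL)")
            self.conn.execute(f"CREATE INDEX {table}_ts ON {table} (ts)")
            self.chunks.add(idx)
        self.conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (ts, value))

    def select_range(self, start, end):
        """Rows with start <= ts < end; chunks outside the range are never read."""
        first = start // self.chunk_seconds
        last = (end - 1) // self.chunk_seconds
        rows = []
        for idx in sorted(i for i in self.chunks if first <= i <= last):
            rows += self.conn.execute(
                f"SELECT ts, value FROM {self._chunk_table(idx)} "
                "WHERE ts >= ? AND ts < ? ORDER BY ts",
                (start, end),
            ).fetchall()
        return rows
```

The point of the design is that each chunk and its index stay small, so recent inserts touch only the newest table and a time-bounded query skips every chunk outside its range.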

Apart from its chunking strategy, which happens under the covers, Timescale is a normal PostgreSQL database, so you can use joins, secondary and partial indexes, and so on.

(InfluxDB uses a log-structured merge tree, with additional machinery for caching and performance optimization. You can read more in the documentation. InfluxDB was originally designed to support web application monitoring.)
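
For readers unfamiliar with the term, here is a minimal sketch of the general log-structured merge idea: writes go to an in-memory table, full tables are flushed as immutable sorted runs, and runs are merged later. It is a generic teaching example, not InfluxDB's storage engine; `TinyLSM` and its parameters are made up for illustration.

```python
import bisect


class TinyLSM:
    """Toy LSM store: mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}  # recent writes, held in memory
        self.runs = []      # flushed runs of sorted (key, value) pairs, newest last

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, never-modified run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest run first so the latest write wins.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; iterating oldest first lets newer values overwrite.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Writes stay cheap because they are sequential appends of whole runs, which suits the high ingest rates described in the question; compaction keeps the number of runs a read has to check small.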

(Disclosure: I have an affiliation with Timescale.)

Upvotes: 1
