Reputation: 15
I'll try to keep this brief. I have a hyper table for stock data called tick_data. I then create a continuous aggregation of 5 seconds of this data. With this I'm trying to create 1 minute calculations but more granular than having a regular 1 minute aggregation. For instance, every 5 seconds I want to keep track of what the 1 minute window of times average volume is. I want this to be able to go back thousands of windows into the past.
So instead of only calculating avg minute volumes in the defined time slots of say 1:10:00 to 1:11:00. I want to have a new calculation for 1:10:00 to 1:11:00, then 1:10:05 to 1:11:05, 1:10:10 to 1:11:10, etc.
I tried creating a trigger function off of the 5 second aggregation to keep a new hyper table of the rolling 1 minute volumes that I would do a new aggregation on but got an error about creating a function.
I'm sure there is probably something with a window function. I'm very new with back-end development. It seems that after you create an aggregation you're limited on what you can do with the aggregation data. I could try something directly on the tick_data but was trying to reduce computation since that could be thousands of computations a second potentially. Maybe the optimizations of Timescale make that not matter I do not know. Looking for advice. Thank you, here are my table creations I have so far.
CREATE TABLE IF NOT EXISTS tick_data (
timestamp TIMESTAMPTZ NOT NULL,
symbol TEXT NOT NULL,
price FLOAT NOT NULL,
size INTEGER NOT NULL,
tick_type TEXT,
condition TEXT,
session TEXT,
dollars FLOAT
SELECT create_hypertable(
'tick_data',
'timestamp',
partitioning_column => 'symbol',
number_partitions => 25,
chunk_time_interval => INTERVAL '1 day',
if_not_exists => TRUE
CREATE MATERIALIZED VIEW IF NOT EXISTS agg_5s
WITH (timescaledb.continuous) AS
SELECT
time_bucket('5 seconds', timestamp) AS bucket_timestamp,
symbol,
session,
-- OHLC aggregations
first(price, timestamp) AS open_price,
last(price, timestamp) AS close_price,
max(price) AS high_price,
min(price) AS low_price,
-- Total Volume and Tick Aggregations
sum(size) AS total_volume,
count(*) AS total_ticks,
sum(dollars) AS total_dollars,
-- Aggregations Based on `tick_type`
sum(CASE WHEN tick_type = 'up' THEN size ELSE 0 END) AS up_volume,
count(CASE WHEN tick_type = 'up' THEN 1 END) AS up_ticks,
sum(CASE WHEN tick_type = 'up' THEN dollars ELSE 0 END) AS up_dollars,
sum(CASE WHEN tick_type = 'down' THEN size ELSE 0 END) AS down_volume,
count(CASE WHEN tick_type = 'down' THEN 1 END) AS down_ticks,
sum(CASE WHEN tick_type = 'down' THEN dollars ELSE 0 END) AS down_dollars,
sum(CASE WHEN tick_type = 'side' THEN size ELSE 0 END) AS side_volume,
count(CASE WHEN tick_type = 'side' THEN 1 END) AS side_ticks,
sum(CASE WHEN tick_type = 'side' THEN dollars ELSE 0 END) AS side_dollars
FROM
tick_data
GROUP BY
bucket_timestamp, symbol, session;
EDIT: This post seems to be doing pretty much the same kind of thing that I want to do but the syntax is screwing me up with my lack of understanding back end. Is it possible to calculate a cumulative sum or moving average with a TimescaleDB continuous aggregate?. So I probably need a SELECT AS that then does sum(over last12) where last12 is the preceding minute for 12, 5 second periods in a minute and then... This is where I'm really lost as then how do I keep the rolling average OF those last12 bucket values over the last X amount of buckets...
EDIT 2: OOOk hold on I am starting to understand that I need to do a stats agg here but I don't know the full syntax of this. I'm not sure if I can do this with a calculated number or not but like
CREATE MATERIALIZED VIEW IF NOT EXISTS agg_5s
WITH (timescaledb.continuous) AS
SELECT
time_bucket('5 seconds', timestamp) AS bucket_timestamp,
symbol,
session,
first(price, timestamp) AS open_price,
last(price, timestamp) AS close_price,
max(price) AS high_price,
min(price) AS low_price,
sum(size) AS total_volume,
count(*) AS total_ticks,
sum(dollars) AS total_dollars,
-- Aggregations Based on `tick_type`
sum(CASE WHEN tick_type = 'up' THEN size ELSE 0 END) AS up_volume,
count(CASE WHEN tick_type = 'up' THEN 1 END) AS up_ticks,
sum(CASE WHEN tick_type = 'up' THEN dollars ELSE 0 END) AS up_dollars,
sum(CASE WHEN tick_type = 'down' THEN size ELSE 0 END) AS down_volume,
count(CASE WHEN tick_type = 'down' THEN 1 END) AS down_ticks,
sum(CASE WHEN tick_type = 'down' THEN dollars ELSE 0 END) AS down_dollars,
sum(CASE WHEN tick_type = 'side' THEN size ELSE 0 END) AS side_volume,
count(CASE WHEN tick_type = 'side' THEN 1 END) AS side_ticks,
sum(CASE WHEN tick_type = 'side' THEN dollars ELSE 0 END) AS side_dollars,
stats_agg(total_volume)
FROM
tick_data
GROUP BY
bucket_timestamp, symbol, session;
And then I need to reference this in a window function query on the aggregated data. Questions now are how do I use the stats_agg on more than one variable in the aggregated data as eventually I would like to add more than just the total volume to this. I also am quite confused on the rolling window function and how the math works on that.
Upvotes: 1
Views: 79