Rolling Window Averages

Question

I'll try to keep this brief. I have a hyper table for stock data called tick_data. I then create a continuous aggregation of 5 seconds of this data. With this I'm trying to create 1 minute calculations but more granular than having a regular 1 minute aggregation. For instance, every 5 seconds I want to keep track of what the 1 minute window of times average volume is. I want this to be able to go back thousands of windows into the past.

So instead of only calculating avg minute volumes in the defined time slots of say 1:10:00 to 1:11:00. I want to have a new calculation for 1:10:00 to 1:11:00, then 1:10:05 to 1:11:05, 1:10:10 to 1:11:10, etc.

I tried creating a trigger function off of the 5 second aggregation to keep a new hyper table of the rolling 1 minute volumes that I would do a new aggregation on but got an error about creating a function.

I'm sure there is probably something with a window function. I'm very new with back-end development. It seems that after you create an aggregation you're limited on what you can do with the aggregation data. I could try something directly on the tick_data but was trying to reduce computation since that could be thousands of computations a second potentially. Maybe the optimizations of Timescale make that not matter I do not know. Looking for advice. Thank you, here are my table creations I have so far.

CREATE TABLE IF NOT EXISTS tick_data (
        timestamp TIMESTAMPTZ NOT NULL,
        symbol TEXT NOT NULL,
        price FLOAT NOT NULL,
        size INTEGER NOT NULL,
        tick_type TEXT,
        condition TEXT,
        session TEXT,
        dollars FLOAT

SELECT create_hypertable(
    'tick_data', 
    'timestamp', 
    partitioning_column => 'symbol',  
    number_partitions => 25,            
    chunk_time_interval => INTERVAL '1 day',
    if_not_exists => TRUE

CREATE MATERIALIZED VIEW IF NOT EXISTS agg_5s
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('5 seconds', timestamp) AS bucket_timestamp,
        symbol,
        session,
        -- OHLC aggregations
        first(price, timestamp) AS open_price,
        last(price, timestamp) AS close_price,
        max(price) AS high_price,
        min(price) AS low_price,
        -- Total Volume and Tick Aggregations
        sum(size) AS total_volume,
        count(*) AS total_ticks,
        sum(dollars) AS total_dollars,
        -- Aggregations Based on `tick_type`
        sum(CASE WHEN tick_type = 'up' THEN size ELSE 0 END) AS up_volume,
        count(CASE WHEN tick_type = 'up' THEN 1 END) AS up_ticks,
        sum(CASE WHEN tick_type = 'up' THEN dollars ELSE 0 END) AS up_dollars,
        sum(CASE WHEN tick_type = 'down' THEN size ELSE 0 END) AS down_volume,
        count(CASE WHEN tick_type = 'down' THEN 1 END) AS down_ticks,
        sum(CASE WHEN tick_type = 'down' THEN dollars ELSE 0 END) AS down_dollars,
        sum(CASE WHEN tick_type = 'side' THEN size ELSE 0 END) AS side_volume,
        count(CASE WHEN tick_type = 'side' THEN 1 END) AS side_ticks,
        sum(CASE WHEN tick_type = 'side' THEN dollars ELSE 0 END) AS side_dollars
    FROM
        tick_data
    GROUP BY
        bucket_timestamp, symbol, session;

EDIT: This post seems to be doing pretty much the same kind of thing that I want to do but the syntax is screwing me up with my lack of understanding back end. Is it possible to calculate a cumulative sum or moving average with a TimescaleDB continuous aggregate?. So I probably need a SELECT AS that then does sum(over last12) where last12 is the preceding minute for 12, 5 second periods in a minute and then... This is where I'm really lost as then how do I keep the rolling average OF those last12 bucket values over the last X amount of buckets...

EDIT 2: OOOk hold on I am starting to understand that I need to do a stats agg here but I don't know the full syntax of this. I'm not sure if I can do this with a calculated number or not but like

CREATE MATERIALIZED VIEW IF NOT EXISTS agg_5s
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('5 seconds', timestamp) AS bucket_timestamp,
        symbol,
        session,
        first(price, timestamp) AS open_price,
        last(price, timestamp) AS close_price,
        max(price) AS high_price,
        min(price) AS low_price,
        sum(size) AS total_volume,
        count(*) AS total_ticks,
        sum(dollars) AS total_dollars,
        -- Aggregations Based on `tick_type`
        sum(CASE WHEN tick_type = 'up' THEN size ELSE 0 END) AS up_volume,
        count(CASE WHEN tick_type = 'up' THEN 1 END) AS up_ticks,
        sum(CASE WHEN tick_type = 'up' THEN dollars ELSE 0 END) AS up_dollars,
        sum(CASE WHEN tick_type = 'down' THEN size ELSE 0 END) AS down_volume,
        count(CASE WHEN tick_type = 'down' THEN 1 END) AS down_ticks,
        sum(CASE WHEN tick_type = 'down' THEN dollars ELSE 0 END) AS down_dollars,
        sum(CASE WHEN tick_type = 'side' THEN size ELSE 0 END) AS side_volume,
        count(CASE WHEN tick_type = 'side' THEN 1 END) AS side_ticks,
        sum(CASE WHEN tick_type = 'side' THEN dollars ELSE 0 END) AS side_dollars,
        stats_agg(total_volume)
    FROM
        tick_data
    GROUP BY
        bucket_timestamp, symbol, session;

And then I need to reference this in a window function query on the aggregated data. Questions now are how do I use the stats_agg on more than one variable in the aggregated data as eventually I would like to add more than just the total volume to this. I also am quite confused on the rolling window function and how the math works on that.

Rolling Window Averages

Answers (0)

Related Questions