Repetitive VARCHAR columns taking a lot of storage on my QuestDB table

Question

I have a table where I store the queries executed by users, along with some metadata. Rows in the table expire after 1 day (TTL).

CREATE TABLE '_query_trace' ( 
    ts TIMESTAMP,
    query_text VARCHAR,
    execution_micros LONG,
    principal VARCHAR
) timestamp(ts) PARTITION BY HOUR TTL 1 DAY BYPASS WAL
WITH maxUncommittedRows=500000, o3MaxLag=5000000us;

I have created a materialized view to keep hourly execution statistics per user and query:

CREATE MATERIALIZED VIEW query_stats AS (
    SELECT ts, principal, query_text,
        count() AS executions,
        sum(execution_micros) AS execution_micros,
        avg(execution_micros) AS avg_execution_micros
    FROM '_query_trace'
    SAMPLE BY 1h
) PARTITION BY DAY;

The problem I have with this view is that the query_text column can be very large. About a third of the queries have over 4000 characters, so I am looking for a good way to reduce the storage they take. Very often the queries are identical (for example, when they filter by relative time ranges like WHERE timestamp in today()), which would make SYMBOL a natural fit. However, I discarded SYMBOL because some dashboard tools, like Grafana, automatically interpolate absolute timestamps into the queries, which makes the cardinality of the column potentially very high.
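
To make the cardinality issue concrete, here is a minimal client-side sketch in Python. It is not part of my setup. The timestamp pattern, the sample queries, and the helper names normalize and fingerprint are all hypothetical, and the exact literal format a dashboard tool injects is an assumption. It only shows that two queries differing in their interpolated time range are distinct strings, while a fingerprint taken after replacing the literals would be shared.

```python
import hashlib
import re

# Hypothetical pattern (an assumption): quoted ISO-8601 timestamps such as
# '2025-01-01T00:00:00.000Z', the kind of literal a dashboard tool might inject.
TS_LITERAL = re.compile(r"'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?'")


def normalize(query_text: str) -> str:
    """Replace absolute timestamp literals with a fixed placeholder."""
    return TS_LITERAL.sub("'?'", query_text)


def fingerprint(query_text: str) -> str:
    """Return a fixed-size hex digest of the normalized query text."""
    return hashlib.sha256(normalize(query_text).encode("utf-8")).hexdigest()


# Two made-up queries that differ only in the interpolated time range.
q1 = ("SELECT * FROM trades WHERE timestamp BETWEEN "
      "'2025-01-01T00:00:00.000Z' AND '2025-01-01T01:00:00.000Z'")
q2 = ("SELECT * FROM trades WHERE timestamp BETWEEN "
      "'2025-01-01T00:05:00.000Z' AND '2025-01-01T01:05:00.000Z'")

print(len({q1, q2}))                            # number of distinct raw strings
print(len({fingerprint(q1), fingerprint(q2)}))  # number of distinct fingerprints
```

With raw text, every dashboard refresh can produce a new distinct value, which is what worries me about SYMBOL.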

Any ideas to work around this?
