Reputation: 1786
At the moment I have several tables with around 4000 columns in total and approx. 1 million rows each, all sharing the same index column.
I have built this partitioning manually by splitting the columns into batches of ~1500 per table (because PostgreSQL's maximum number of columns per table is 1600).
Question: Is there a managed and more efficient way similar to partitioning on a range of a specific column?
Question: Do you think it would be a suitable approach to use Citus 10 COLUMNAR support and remove the primary key on the index column?
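For context, a minimal sketch of the manual vertical split described above (all table and column names here are invented for illustration):

```sql
-- Hypothetical sketch: the ~4000 columns are spread over several tables
-- that share one index column, each staying under PostgreSQL's 1600-column cap.
CREATE TABLE wide_part_1 (
    id       bigint PRIMARY KEY,
    col_0001 double precision,
    -- ... up to ~1500 columns ...
    col_1500 double precision
);

CREATE TABLE wide_part_2 (
    id       bigint PRIMARY KEY,
    col_1501 double precision,
    -- ... up to ~1500 columns ...
    col_3000 double precision
);

-- Reading attributes that landed in different batches requires a join
-- on the shared index column:
SELECT a.col_0001, b.col_1501
FROM wide_part_1 a
JOIN wide_part_2 b USING (id);
```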
Upvotes: 0
Views: 227
Reputation: 31
Citus 10 COLUMNAR storage seems like a good candidate for your use case. It uses Projection Pushdown, meaning that if the queries you usually run target only a few columns, they will skip over the columns they don't need.
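As a sketch of what that looks like (the table and column names are invented; check the Citus docs for your version), a columnar table is created by choosing the `columnar` access method, and Citus also ships a helper to convert an existing row table:

```sql
-- Hypothetical example: 'measurements' and its columns are invented names.
CREATE TABLE measurements (
    id    bigint,
    col_a double precision,
    col_b double precision
    -- ... many more columns ...
) USING columnar;

-- Converting an existing heap (row) table instead:
SELECT alter_table_set_access_method('measurements', 'columnar');

-- A query touching only one of the many columns benefits from projection
-- pushdown: the chunks of the other columns are never read or decompressed.
SELECT avg(col_a) FROM measurements;
```

Note that with columnar storage you give up some row-store features (in Citus 10, for example, UPDATE/DELETE on columnar tables are not supported), so it fits append-mostly, analytics-style workloads best.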
Regarding your second question, there is some sort of "indexing" in Columnar. Queries use Chunk Group Filtering, which lets them skip over whole chunk groups of data based on filters on those columns. Let me copy & paste the relevant section from a blog post about Citus columnar:
Chunk Group Filtering allows queries to skip over Chunk Groups of data if the metadata indicates that none of the data in the chunk group will match the predicate. In other words, for certain kinds of queries and data sets, it can skip past a lot of the data quickly, without even decompressing it!
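A minimal sketch of how you would see this in practice, assuming the invented `measurements` table from above is loaded roughly sorted by `id` so the per-chunk min/max metadata stays selective:

```sql
-- Hypothetical illustration: a range filter on a column whose values are
-- clustered lets whole chunk groups be skipped via their min/max metadata.
SELECT avg(col_a)
FROM measurements
WHERE id BETWEEN 100000 AND 110000;

-- EXPLAIN (ANALYZE) on a columnar table reports how many chunk groups were
-- skipped by the filter (the exact wording of the output lines depends on
-- the Citus version):
EXPLAIN (ANALYZE)
SELECT avg(col_a)
FROM measurements
WHERE id BETWEEN 100000 AND 110000;
```

This is why dropping the primary key can be acceptable: for range filters on a well-clustered column, chunk group filtering does much of the work a btree index would otherwise do, at far lower storage cost.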
Upvotes: 2