Reputation: 2913
I've recently started diving into Cassandra. However, I can't find any explicit documentation or guidance about predefining columns and their data types. Within a column family, Cassandra allows dynamic columns, much like a document-oriented database (MongoDB). CQL, however, lets you predefine the columns and their types with CREATE TABLE.
So, it's obvious that enforcing column types decreases the chance of invalid or wrong inserts.
Are there any other advantages to predefining column types? For instance, is there a read performance increase if we have a predefined number of columns and their types?
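To make the question concrete, here is a minimal sketch of the kind of definition I mean. The table and column names (`users_typed`, `name`, `age`) are made up for illustration, and the second insert is only there to show the sort of write that type checking should reject.

```cql
-- Hypothetical table with predefined, typed columns
CREATE TABLE users_typed (
    user_id uuid PRIMARY KEY,
    name    text,
    age     int
);

-- Accepted: every value matches its declared column type
INSERT INTO users_typed (user_id, name, age)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'alice', 30);

-- Rejected at validation time: a string literal is not a valid int
INSERT INTO users_typed (user_id, name, age)
VALUES (62c36092-82a1-3a00-93d1-46196ee77205, 'bob', 'thirty');
```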
Upvotes: 1
Views: 200
Reputation: 14153
Because the schema is predefined, you have to alter it before you can insert values for new columns. Requiring ALTER allowed for a number of performance enhancements that couldn't be achieved before, such as reducing the memory taken up by columns that are stored on the heap.
This overhead is reduced on disk by compaction, but that can't be done in memory (and it matters, because reading from the memory cache is of course faster than reading from disk). Handling this will:
If you want the full technical details (including how the developers propose to implement the solution), take a look at the issue on Apache Cassandra's JIRA.
Just a note: the collections supported by Cassandra should cover use cases where adding columns would otherwise be required (for the sake of clarity, I mean CQL columns). Having a static schema also forces developers to think about their data model and build it correctly.
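As a rough sketch of both options, assuming a hypothetical `user_profiles` table (all names here are invented for illustration): you can either add a column with ALTER TABLE before writing to it, or declare a collection column once and put open-ended attributes into it without any further schema change.

```cql
-- Hypothetical table used only for this example
CREATE TABLE user_profiles (
    user_id uuid PRIMARY KEY,
    name    text
);

-- Option 1: alter the schema first, then write to the new column
ALTER TABLE user_profiles ADD email text;
INSERT INTO user_profiles (user_id, name, email)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'alice', 'alice@example.com');

-- Option 2: one map column holds arbitrary key/value attributes
ALTER TABLE user_profiles ADD attributes map<text, text>;
UPDATE user_profiles
SET attributes['nickname'] = 'al'
WHERE user_id = 62c36092-82a1-3a00-93d1-46196ee77204;
```

With the map, new attribute keys can be written at any time, while the column itself still has a declared type (`map<text, text>`), so values are still validated.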
I advise you to read this article by jbellis and all the comments that follow; it clarifies most of the points on why the static schema was enforced.
Upvotes: 1