Chamila Wijayarathna

Reputation: 1941

Graphically Interpreting Cassandra data schema

Just as ER diagrams are used for relational schemas, is there any way to graphically represent a schema created for Cassandra?

Upvotes: 1

Views: 2141

Answers (2)

lbruand

Reputation: 21

I wrote the tool cql2plantuml that extracts a plantuml .puml file from a CQL schema.

You still have to edit the .puml file by hand to add the relations, though, because a Cassandra keyspace does not contain any representation of the relations between tables.

Upvotes: 2

phact

Reputation: 7305

There are many ways of going about this, and I would recommend checking out DataStax's Data Modeling training for a systematic, in-depth look.

Building an ordinary ERD and a list of expected queries is a good first step toward getting your data model right.

Once you have this, you want to convert it to a Cassandra-specific diagram where you represent partition keys (P), clustering columns (C), and even secondary indexes (but only for low-cardinality fields). Remember, multiple entities in your ERD may translate into one C* table, and you may end up duplicating some of your writes in order to improve read performance and allow for different types of queries. A simple example might look like the following:

Reviews_by_Day
userid text       P
day int           C
productid text
reviewid uuid
profilename text 
helpfulness text
score text
summary text 
review text 
time timestamp

You can also specify asc / desc in your clustering columns. The diagram above would represent the following table:

CREATE TABLE reviews_by_day
(
    userid text,
    day int,
    productid text,
    reviewid uuid,
    profilename text,
    helpfulness text,
    score text,
    summary text,
    review text,
    time timestamp,
    PRIMARY KEY (userid, day)
);
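
As a minimal sketch of the asc / desc option mentioned above: the sort order of a clustering column is declared with CLUSTERING ORDER BY when the table is created. The table name reviews_by_day_desc is hypothetical and the column list is shortened for illustration.

```cql
-- Hypothetical variant of reviews_by_day with a shortened column list.
-- Rows within each userid partition are stored newest day first.
CREATE TABLE reviews_by_day_desc (
    userid text,
    day int,
    productid text,
    reviewid uuid,
    summary text,
    PRIMARY KEY (userid, day)
) WITH CLUSTERING ORDER BY (day DESC);
```

In the diagram this could be shown by annotating the clustering column, for example "day int  C (desc)". Note that with PRIMARY KEY (userid, day) a user has a single row per day; if several reviews per day must be kept, adding reviewid as a second clustering column is one option.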

Combine this with a list of the queries you expect to run against C*, and think about which tables will serve each one. You can augment the diagram by adding your queries (labeled Q1, Q2, etc.) and using arrows to show the application flow.
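
To make the query list concrete, here is a rough sketch of what Q1 to Q3 might look like for the example table. The literal values, the integer day format (yyyymmdd) and the reviews_by_product table are assumptions made up for illustration; they are not part of the original model.

```cql
-- Q1: all reviews written by one user (partition key only).
SELECT * FROM reviews_by_day WHERE userid = 'user-123';

-- Q2: one user's reviews within a range of days
-- (partition key plus a range on the clustering column).
SELECT * FROM reviews_by_day
WHERE userid = 'user-123' AND day >= 20150101 AND day <= 20150131;

-- Q3: all reviews for a product. reviews_by_day cannot serve this
-- efficiently, so the same data is written to a second, hypothetical
-- table that is partitioned by productid.
CREATE TABLE reviews_by_product (
    productid text,
    day int,
    reviewid uuid,
    userid text,
    score text,
    summary text,
    PRIMARY KEY (productid, day, reviewid)
);

SELECT * FROM reviews_by_product WHERE productid = 'product-456';
```

Q3 is an instance of the write duplication described earlier: each review is written to both tables so that each query reads from a single partition.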

One more tool that may be useful is this data modeling application which allows you to type in your table definition and see how it is stored in the Cassandra storage engine under the hood (currently it does not support collections). It also lets you compute the estimated partition size for your table and generates a sample .yaml file for use with Cassandra's new cassandra-stress from C* 2.1 (which is backwards compatible with 2.0).

Note: This tool is in development and may change.

Upvotes: 3
