Reputation: 577
Folks,
Recently I was reading some blogs on NoSQL column-oriented storage, and I am trying my hand at Cassandra and HBase.
What I understood is that data is stored in a column-oriented manner.
e.g. Employee Id , Employee Name, Last Name
100 , 'abc', 'xyz'
200 , 'ABC' , 'XYZ'
Then the data will be stored on disk in the following format (column-oriented storage: the values of a single column are kept together)
First column Second column Third Column
100|200 , 'abc'|'ABC' , 'xyz'|'XYZ'
1 ) If we have to retrieve a single row with id = 100, how is that done? Since the row's data is not contiguous, will it be costly? (Is there an index from the row key into all the columns?)
2 ) Why don't HBase and Cassandra have proper support for aggregation functions, given that column-oriented storage is meant for that?
Upvotes: 2
Views: 190
Reputation: 15089
Simple answer: HBase and Cassandra aren’t column-oriented, they are row-oriented. The difference from traditional databases, however, is that each row is actually a key/value pair of the primary key and an arbitrary number of columns.
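To make the "key/value pair of the PK and a number of columns" idea concrete, here is a minimal in-memory sketch. It is not how HBase or Cassandra are implemented; the dict `rows`, the helper `get_row`, the column names, and the `email` value are all made up for illustration.

```python
# Hypothetical in-memory model of a row-keyed store (not a real HBase/Cassandra API).
# Each row key maps to its own set of columns; rows need not share the same columns.
rows = {
    100: {"first_name": "abc", "last_name": "xyz"},
    200: {"first_name": "ABC", "last_name": "XYZ", "email": "abc@example.com"},
}


def get_row(store, row_key):
    """Return all columns of one row with a single key lookup."""
    return store.get(row_key, {})


row_100 = get_row(rows, 100)
print(row_100)
```

Because the whole row hangs off its key, fetching id = 100 is one lookup rather than a search through separate per-column files.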
Column-oriented databases are, for instance, Vertica and Teradata.
You are right, however, that retrieving a full row from column-oriented storage is more costly than from a row-oriented DB. But column-oriented DBMSs were invented for analysis, where you usually want to aggregate a few columns over all the data, while row-oriented storage is meant for retrieving (almost) full rows from only a small subset of the data.
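The trade-off can be sketched with the two employees from the question held in both layouts. This is a toy model, assuming plain Python lists stand in for on-disk storage; the `salary` column and all function names are invented for the example, and real column stores use more elaborate positional or indexed lookups.

```python
# Toy comparison of a row layout and a column layout for the same data.
# The "salary" column is hypothetical; it is only here to have something to aggregate.
row_store = [
    (100, "abc", "xyz", 1000),
    (200, "ABC", "XYZ", 2000),
]

column_store = {
    "employee_id": [100, 200],
    "first_name": ["abc", "ABC"],
    "last_name": ["xyz", "XYZ"],
    "salary": [1000, 2000],
}


def total_salary_columnar(cols):
    # Aggregation touches exactly one column; the other columns are never read.
    return sum(cols["salary"])


def total_salary_rowwise(rows):
    # Aggregation has to walk every full row and pick out a single field.
    return sum(row[3] for row in rows)


def fetch_row_columnar(cols, employee_id):
    # Find the position in the key column, then visit every column
    # at that position to reassemble the row. Raises ValueError if the id is absent.
    pos = cols["employee_id"].index(employee_id)
    return tuple(values[pos] for values in cols.values())


def fetch_row_rowwise(rows, employee_id):
    # The row is already stored together; return it as soon as the key matches.
    for row in rows:
        if row[0] == employee_id:
            return row
    return None
```

In this sketch the columnar aggregate reads one list, while the columnar row fetch has to touch every column, which mirrors the cost difference described above.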
Upvotes: 2