Reputation: 1944
How is this possible? What is it about NoSQL that gives it a higher write throughput than some RDBMS? Does it boil down to scalability?
Upvotes: 4
Views: 6084
Reputation: 161
A higher write throughput can also be credited to the internal data structures that power the database storage engine.
Even though B-tree implementations used by some RDBMS have stood the test of time, LSM-trees used in some key-value datastores are typically faster for writes:
1: When a write comes, you add it to the in-memory balanced tree, called memtable.
2: When the memtable grows big, it is flushed to the disk.
To understand this data structure better, please check this video and this answer.
Upvotes: 1
Reputation: 1944
In summary, NoSQL databases are built to easily scale across a large number of servers (by sharding/horizontal partitioning of data items), and to be fault tolerant (through replication, write-ahead logging, and data repair mechanisms). Furthermore, NoSQL supports achieving high write throughput (by employing memory caches and append-only storage semantics), low read latencies (through caching and smart storage data models), and flexibility (with schema-less design and denormalization).
From:
Open Journal of Databases (OJDB) Volume 1, Issue 2, 2014 www.ronpub.com/journals/ojdb ISSN 2199-3459
https://estudogeral.sib.uc.pt/bitstream/10316/27748/1/Which%20NoSQL%20Database.pdf - page 19
Upvotes: 7
Reputation: 721
Some noSQL systems are basically just persistent key/value storages (like Project Voldemort). If your queries are of the type "look up the value for a given key", such a system will (or at least should be) faster that an RDBMS, because it only needs to have a much smaller feature set.
Another popular type of noSQL system is the document database (like CouchDB). These databases have no predefined data structure. Their speed advantage relies heavily on denormalization and creating a data layout that is tailored to the queries that you will run on it. For example, for a blog, you could save a blog post in a document together with its comments. This reduces the need for joins and lookups, making your queries faster, but it also could reduce your flexibility regarding queries.
There are many NoSQL solutions around, each one with its own strengths and weaknesses, so the following must be taken with a grain of salt.
But essentially, what many NoSQL databases do is rely on denormalization and try to optimize for the denormalized case. For instance, say you are reading a blog post together with its comments in a document-oriented database. Often, the comments will be saved together with the post itself. This means that it will be faster to retrieve all of them together, as they are stored in the same place and you do not have to perform a join.
Of course, you can do the same in SQL, and denormalizing is a common practice when one needs performance. It is just that many NoSQL solutions are engineered from the start to be always used this way. You then get the usual tradeoffs: for instance, adding a comment in the above example will be slower because you have to save the whole document with it. And once you have denormalized, you have to take care of preserving data integrity in your application.
Moreover, in many NoSQL solutions, it is impossible to do arbitrary joins, hence arbitrary queries. Some databases, like CouchDB, require you to think ahead of the queries you will need and prepare them inside the DB.
All in all, it boils down to expecting a denormalized schema and optimizing reads for that situation, and this works well for data that is not highly relational and that requires much more reads than writes.
This link explains a lot moreover where: RDBMS -> data integrity is a key feature (which can slow down some operations like writing) NoSQL -> Speed and horizontal scalability are imperative (So speed is really high with this imperatve)
AAAND... The thing about NoSQL is that NoSQl cannot be compared to SQL in any way. NoSQL is name of all persistence technologies that are not SQL. Document DBs, Key-Value DBs, Event DBs are all NoSQL. They are all different in almost all aspects, be it structure of saved data, querying, performance and available tools.
Hope it is useful to understand
Upvotes: 9