Emmanuel

Reputation: 5395

How to set up YugabyteDB as an eventually consistent distributed key-value database?

I'm launching a startup company providing Web SaaS (https://tuilder.com/). Big plans and potential.

I'm interested in global replication with YugaByte. At the moment I have built an abstraction over BadgerDB, a key-value database written in Go. Our abstraction maintains indexes, is kind of GraphQL-ish, and is extremely FAST. Is it possible to use YugaByte DB with global replication as a key-value store?

I'm aiming for the performance of KeyValue, Globally Distributed.

As I understand it, the speed of writes decreases with each additional replicated node. Is that right? Is it possible to instead favour speed and have an eventually consistent model across the nodes? We are building a JAM stack, so we need an authentication layer on the server between YugaByte and the client; ideally this layer would provide the same abstraction we currently have written in Go.

What about load-balancing between nodes routing requests to the closest geographical location?

Is this all possible with YugaByte Platform?

Upvotes: 5

Views: 901

Answers (2)

Sid Choudhury

Reputation: 190

As I understand the speed of writes decreases with each additional replicated node. Is that right?

The previous answer assumes that an additional replicated node is in fact an additional replica. However, if it means a new node in the cluster, then the answer is that a new node does not increase write latency. A new node simply provides more write (and read) throughput to the cluster, since it can now host some of the leader and follower shards (aka tablets) present in the cluster.

The latency of a key-value write operation is controlled by the Replication Factor (RF) of the cluster, where the typical RF is 3 for production deployments. In such deployments, each shard has 3 replicas located on 3 separate nodes of the cluster. A write has to commit at the leader replica and at least 1 of the 2 follower replicas before the operation is acknowledged back to the application client as successful. In summary, the latency of write operations increases only when either or both of the following actions are taken (the sketch after the list makes the quorum arithmetic concrete):

  1. increase the geographical distance between the nodes hosting the 3 replicas

  2. increase the RF from, say, 3 to 5 (which would require the leader and at least 2 of the 4 follower replicas, i.e. 3 out of 5, to commit the write before the client ack).
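To make that quorum arithmetic concrete, here is a minimal Go sketch (the quorumSize helper is purely illustrative, not part of any YugabyteDB API): under Raft, a write must be persisted by a majority of the RF replicas before it is acknowledged.

    package main

    import "fmt"

    // quorumSize returns how many replicas (leader included) must persist a
    // write before it can be acknowledged under Raft: a majority of the RF.
    func quorumSize(rf int) int {
        return rf/2 + 1
    }

    func main() {
        for _, rf := range []int{1, 3, 5} {
            fmt.Printf("RF=%d -> write must reach %d of %d replicas before ack\n",
                rf, quorumSize(rf), rf)
        }
        // RF=1 -> write must reach 1 of 1 replicas before ack
        // RF=3 -> write must reach 2 of 3 replicas before ack
        // RF=5 -> write must reach 3 of 5 replicas before ack
    }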

Is it possible to instead favour speed and have an eventually-consistent model across the nodes instead?

Eventual consistency is not possible in Yugabyte DB, given its reliance on per-shard distributed consensus (using Raft as the consensus protocol) among the nodes that process write requests. You can review how Raft differs from eventual consistency in the post "How Does Consensus-Based Replication Work in Distributed Databases?", under the "Paxos/Raft vs. Non-Consensus Replication Protocols" section. As highlighted in the previous answer, when cross-region write latency is a concern, the solution is to use asynchronous replication across regions, using either Read Replica Clusters (for powering timeline-consistent reads in regions far away from the regions taking write requests) or Multi-Master Clusters (for powering writes in multiple regions, with conflicting writes resolved through Last Writer Wins).
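As a generic illustration of Last Writer Wins (a sketch of the idea only, not YugabyteDB's internal implementation), conflicting versions of a key replicated between two clusters can be resolved by comparing write timestamps:

    package main

    import (
        "fmt"
        "time"
    )

    // versionedValue pairs a value with the wall-clock timestamp of its write.
    // This is a generic illustration of last-writer-wins, not YugabyteDB internals.
    type versionedValue struct {
        Value     string
        WrittenAt time.Time
    }

    // resolveLWW keeps the version with the later timestamp; ties keep the local copy.
    func resolveLWW(local, remote versionedValue) versionedValue {
        if remote.WrittenAt.After(local.WrittenAt) {
            return remote
        }
        return local
    }

    func main() {
        local := versionedValue{Value: "written-in-us-east", WrittenAt: time.Now().Add(-2 * time.Second)}
        remote := versionedValue{Value: "written-in-eu-west", WrittenAt: time.Now()}
        fmt.Println("winner:", resolveLWW(local, remote).Value) // winner: written-in-eu-west
    }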

Upvotes: 2

Karthik Ranganathan

Reputation: 687

Thanks for your interest in Yugabyte DB! This is definitely a great use-case. Please see the answers inline.

I’m interested in global replication with YugaByte. At the moment I have built an abstraction over BadgerDB, a Key-Value database written in GoLang. Our abstraction maintains indexes, is kind of graphql-ish, and extremely FAST. Is it possible to use YugaByte DB with global replication as a Key Value store, instead of GoLang? I’m aiming for the performance of KeyValue, Globally Distributed.

Yes, you can absolutely achieve a high-performance, globally distributed key-value deployment with Yugabyte DB. You can see an example of a globally distributed deployment here.
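For reference, here is a minimal Go sketch of a key-value workload against YugabyteDB's Cassandra-compatible YCQL API using the gocql driver. The keyspace, table, and key names are illustrative, and a single node is assumed to be listening on 127.0.0.1:9042:

    package main

    import (
        "fmt"
        "log"

        "github.com/gocql/gocql" // Cassandra driver; works against the YCQL API
    )

    func main() {
        cluster := gocql.NewCluster("127.0.0.1")
        cluster.Port = 9042
        session, err := cluster.CreateSession()
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        // A minimal key-value schema; keyspace and table names are illustrative.
        if err := session.Query(`CREATE KEYSPACE IF NOT EXISTS kvstore`).Exec(); err != nil {
            log.Fatal(err)
        }
        if err := session.Query(`CREATE TABLE IF NOT EXISTS kvstore.kv (k TEXT PRIMARY KEY, v BLOB)`).Exec(); err != nil {
            log.Fatal(err)
        }

        // Put and get a single key.
        if err := session.Query(`INSERT INTO kvstore.kv (k, v) VALUES (?, ?)`,
            "user:42", []byte(`{"name":"emmanuel"}`)).Exec(); err != nil {
            log.Fatal(err)
        }
        var v []byte
        if err := session.Query(`SELECT v FROM kvstore.kv WHERE k = ?`, "user:42").Scan(&v); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("user:42 -> %s\n", v)
    }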

As I understand the speed of writes decreases with each additional replicated node. Is that right? Is it possible to instead favour speed and have an eventually-consistent model across the nodes instead?

As a general rule, yes, latency increases with the replication factor. Replication factor is primarily meant to improve fault tolerance, but it looks like you want to serve reads close to the end user. In this scenario, you have two options:

  • Read Replicas are a read-only extension to the primary data in the cluster. In this scenario, the primary data of the cluster is deployed across multiple zones in one region, or across nearby regions. Read replicas do not add to write latency, since a write does not synchronously replicate data to them; the data gets replicated to read replicas asynchronously. You can write to a replica, but the write gets internally redirected to the source of truth.

  • Multi-master deployments are currently being released as a part of our 2.0 version in beta. This feature allows independent clusters to replicate to each other with last writer wins semantics. Here is a detailed design doc about multi-master deployments.

Assuming you want global reads and a single cluster, I think read replicas may be what you are looking for.

So we need an authentication layer on the server between YugaByte and the client, ideally this layer would provide the same abstraction we currently have written in Go.

Yes, Yugabyte DB supports authentication and RBAC for authorization in the Go client drivers.
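As a rough sketch, with YCQL authentication enabled on the cluster, the Go layer sitting between the JAM-stack client and YugabyteDB could connect with a dedicated role like this (the role name and password are placeholders):

    package main

    import (
        "log"

        "github.com/gocql/gocql"
    )

    func main() {
        // Assumes YCQL auth is enabled and a role "app_user" exists with the
        // privileges this service needs; both credentials below are placeholders.
        cluster := gocql.NewCluster("127.0.0.1")
        cluster.Authenticator = gocql.PasswordAuthenticator{
            Username: "app_user",
            Password: "app_password",
        }

        session, err := cluster.CreateSession()
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        // From here the Go service can expose its own authenticated API to the
        // client while talking to YugabyteDB under this role's privileges.
    }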

What about load-balancing between nodes routing requests to the closest geographical location?

The YCQL API already supports reading from the nearest geographic region, so you should be able to achieve this easily. The YCQL API is semi-relational, but for the purposes of a key-value app it should be more than sufficient!
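For example, with the gocql driver you can combine data-center-aware and token-aware host selection so that requests prefer the nearest region. The host addresses and the local data center name below are placeholders, and this assumes the tservers advertise their region as the Cassandra data center:

    package main

    import (
        "log"

        "github.com/gocql/gocql"
    )

    func main() {
        cluster := gocql.NewCluster("10.0.0.1", "10.0.0.2", "10.0.0.3")

        // Prefer hosts in the local data center ("us-west1" is a placeholder),
        // then route by token so requests land on the node owning the key.
        cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
            gocql.DCAwareRoundRobinPolicy("us-west1"),
        )

        session, err := cluster.CreateSession()
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()
    }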

Hope that helps, and let me know if you have any further questions!

Upvotes: 3
