djechlin

Reputation: 60748

Minimizing inconsistency between tables in denormalized databases like Cassandra

Cassandra (and BigTable, etc) recommends a denormalized database, where tables are designed from the expected queries. The Cassandra doc uses this example:

hotels_by_poi:   poi_name (Key)
                 hotel_id (Cluster key)
                 name
                 phone
                 address

hotels:          hotel_id (Key)
                 name
                 phone
                 address

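In CQL, that schema could be sketched roughly as follows (the column types are assumptions; the doc's example only names the columns and keys):

```sql
-- Lookup table: one partition per point of interest,
-- one row per hotel near it (hotel_id is the clustering key).
CREATE TABLE hotels_by_poi (
    poi_name text,
    hotel_id text,
    name     text,
    phone    text,
    address  text,
    PRIMARY KEY (poi_name, hotel_id)
);

-- Canonical table: one partition per hotel.
CREATE TABLE hotels (
    hotel_id text PRIMARY KEY,
    name     text,
    phone    text,
    address  text
);
```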
So name, phone, and address are denormalized between hotels_by_poi and hotels. What I'm wondering about is how to implement this method:

update_hotel_info(hotel_id, name, phone, address) {
    updateHotel(hotel_id, name, phone, address);
    updatePoisByHotel(hotel_id, name, phone, address);
}

It's possible that the first call fails, or that the server running the two calls crashes between the first and second update. Either way the two tables get out of sync, and without doing anything else the data isn't even eventually consistent.

Upvotes: 1

Views: 287

Answers (2)

Manish Khandelwal

Reputation: 2310

As @Erick mentioned, either use a batch to maintain consistency, or handle it on the client side by retrying failed inserts/updates. For example:

update_hotel_info(hotel_id, name, phone, address) {
    updateHotel(hotel_id, name, phone, address);
    updatePoisByHotel(hotel_id, name, phone, address);
}

You can retry update_hotel_info if either insert/update fails. This way you get fast writes and can take advantage of Cassandra's cheap writes.

Upvotes: 0

Erick Ramirez

Reputation: 16293

The idea is to wrap the related table updates in a CQL BATCH statement as I've explained here -- https://community.datastax.com/articles/2744/.
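Using the question's schema, a logged batch might look like the following sketch (the literal values, and the fact that you'd run one `hotels_by_poi` update per POI the hotel appears under, are assumptions for illustration):

```sql
BEGIN BATCH
    UPDATE hotels
       SET name = 'Grand Hotel', phone = '555-0100', address = '1 Main St'
     WHERE hotel_id = 'h1';
    UPDATE hotels_by_poi
       SET name = 'Grand Hotel', phone = '555-0100', address = '1 Main St'
     WHERE poi_name = 'museum' AND hotel_id = 'h1';
APPLY BATCH;
```

A logged batch guarantees that if any of the statements is applied, all of them eventually will be, which is exactly the "not even eventually consistent" gap the question describes; it does not provide isolation, and it costs more than individual writes.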

Even if you didn't use CQL batches, the idea is that if either of those writes fails, you should have error handling that (for example) retries the request to make sure both succeed. Cheers!

Upvotes: 2
