JSONStatham

Reputation: 373

noSQL rollback feature

I'm new to noSQL technologies, and I was surprised that there is no transaction support whatsoever. My main problem is that one of our insert tasks consists of ~5 separate inserts. We have to find a document by 4 different IDs. The problem is that the document is fairly large, and it's really expensive to store it like this:

So we came up with an internal Id that points to the document. Yes, I know this design somewhat violates the whole noSQL concept, but it saves a lot of memory. If the document insert fails, the Ids have no meaning and should be removed. Is it a good idea to write my own rollback handling, keep track of successful inserts/updates? Or the whole concept is wrong?

Upvotes: 3

Views: 2719

Answers (3)

Aaron

Reputation: 57758

I know this design somewhat violates the whole noSQL concept, but it saves a lot of memory.

That is a very 1970s way of thinking. Relational database theory originated at a time when disk space was expensive. In 1975, IBM was selling hard drives at $11k per megabyte. By 1980, prices had dropped so that you could buy a gigabyte's worth of storage space for under $1 million. Today, you can go on NewEgg and buy a terabyte drive for $60. Now disk space is cheap, and processing time is the expensive part.

In non-relational (NoSQL) data modeling, you should build your table structures according to how it makes sense to query your data. This is a departure from relational data modeling, where you build your tables according to how it makes sense to store your data. Oftentimes, query-based modeling results in storage of redundant data, and that's OK. Duplicate data for speed, reference data for integrity.
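
As a rough illustration of that trade-off, here is a minimal sketch using plain Python dicts as stand-ins for key-value tables. The ID strings, the `internal-1` key, and the function names are all made up for the example; nothing here is tied to a particular database.

```python
# Hypothetical data: one document that must be found by four different IDs.
document = {"title": "example", "body": "..."}
external_ids = ["email:a@example.com", "phone:555-0100", "sku:42", "uuid:abc"]

# Layout A (reference, as in the question): four small pointer records
# plus a single copy of the document.
pointers = {ext_id: "internal-1" for ext_id in external_ids}
documents = {"internal-1": document}

def lookup_by_reference(ext_id):
    # Two reads: the pointer first, then the document it points to.
    return documents[pointers[ext_id]]

# Layout B (query-based): the document is duplicated under every ID
# it is queried by.
by_external_id = {ext_id: dict(document) for ext_id in external_ids}

def lookup_denormalized(ext_id):
    # One read, at the cost of storing the document four times.
    return by_external_id[ext_id]
```

Layout A saves space but needs the pointer records and the document to stay in step; Layout B spends space so that each lookup is a single read.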

Is it a good idea to write my own rollback handling, keep track of successful inserts/updates? Or the whole concept is wrong?

I was on a Cassandra project where we did implement something similar to an application-side transaction/rollback. It really didn't work very well, and ended up creating several tombstones. Ultimately, you should ask yourself exactly why your application needs a non-relational database, because it sounds like you still need some of the benefits of a relational database. If you're sure that you absolutely need a non-relational database, then you may want to re-think your approach to data modeling.

Upvotes: 5

Helipilot50

Reputation: 227

If you were using a NoSQL database, like Aerospike, that supports Time To Live (TTL), one approach to take would be to write your document record first with a short TTL, then write your mapping records that map external ID --> internal Id.

On success of all your mapping records, touch the document record and update the TTL on the document to suit your use case.

On failure of your external ID --> internal Id records, simply allow the short TTL to expire and the database will clean it up.

This is a partial 2PC approach. It will work if your database latencies are small (under 5ms). It may be good enou

This kind of algorithm is used successfully in the AdTech industry, where they are mapping an external ID (like a cookie or mobile device ID) to an internal user profile.
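
The steps above can be sketched roughly as follows. This is not Aerospike client code: `TTLStore` is a toy in-memory store invented for the example, and `insert_with_mappings`, `short_ttl`, and `final_ttl` are hypothetical names. It only shows the ordering of the writes, assuming the store supports per-record expiry and a touch operation.

```python
import time

class TTLStore:
    """Toy in-memory key-value store with per-record expiry (not a real client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None for "never")

    def put(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily drop the expired record
            return None
        return value

    def touch(self, key, ttl=None):
        # Reset the expiry of an existing, unexpired record.
        value = self.get(key)
        if value is None:
            raise KeyError(key)
        self.put(key, value, ttl)


def insert_with_mappings(store, internal_id, document, external_ids,
                         short_ttl=5.0, final_ttl=None):
    # Step 1: write the document with a short TTL.
    store.put(("doc", internal_id), document, ttl=short_ttl)
    # Step 2: write each external ID --> internal Id mapping record.
    for ext_id in external_ids:
        store.put(("map", ext_id), internal_id, ttl=final_ttl)
    # Step 3: every mapping succeeded, so touch the document and extend its TTL.
    store.touch(("doc", internal_id), ttl=final_ttl)
```

If any mapping write raises, step 3 never runs and the document simply expires after `short_ttl`. Note that in this sketch, mapping records written before the failure stay behind and point at a document that no longer exists, so readers would need to treat a missing document as "not found".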

Upvotes: 3

Don Branson

Reputation: 13709

Getting away from transactions, the move from ACID to BASE, has huge benefits in terms of flexibility and tunable scalability. Dan Pritchett spoke about their re-thinking of the need for transactionality, for the limited context where transactions are really needed. Read Amazon's Dynamo whitepaper, which explains the benefits of a transactionless database.

There is a mindset change. If you try to use a post-relational database using relational patterns, you'll be unhappy with the result, in the same way that people were unhappy with relational databases when applying patterns learned with pre-relational databases.

Upvotes: 2
