Reputation: 27969
Our backend is written in Kotlin. The data is in MongoDB.
We did some profiling and this revealed that the current bottleneck
is that too much data gets transferred between MongoDB and the Kotlin
backend.
get_by_id() fetches the same data again and again.
We thought about caching all get_by_id() calls in an in-memory cache (shared by all threads of this node). This way all threads on a node can benefit from the faster access to data from this cache.
The next step would be to implement cache invalidation. All modifications would need to update the in-memory cache.
Before implementing this, I want to know whether there are different or better ways to do it.
How can we avoid fetching the same data from MongoDB again and again?
Upvotes: 3
Views: 341
Reputation: 1016
There are many ways you can optimize your backend to minimize data transfer to your database system.
As you mention in your question, an in-memory cache would take a lot of load off your bottlenecked backend. A performance-optimized key-value store like Redis can be extremely useful when used appropriately. Unlike a per-node in-process cache, Redis is a separate server, so all nodes can share one cache at the cost of a network round trip per lookup. If you refresh cached entries only when you need to, the cache can take a lot of load off your database and make requests to your backend a lot faster. However, retrofitting cache invalidation into a working system might not be so easy. One advantage of Redis is that it is schemaless too, so MongoDB documents map onto it fairly naturally. You might want to take a look at Redis hashes: they are very similar to a JSON document, though they are only one layer deep and cannot hold nested objects. You could also store the JSON as a string value under a key if that better suits your needs. Another useful Redis feature is automatic key expiration: you can set a time to live on a key for easier cache control.
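To make the idea of expiry plus explicit invalidation concrete, here is a minimal sketch of an in-process read-through cache in Kotlin. The class name `TtlCache`, the `loader` parameter and the injectable `clock` are assumptions made for illustration; the loader stands in for whatever your real get_by_id() does. Against Redis the shape would be the same, with the map replaced by calls to a Redis client.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical read-through cache; all names here are illustrative only.
class TtlCache<K : Any, V : Any>(
    private val ttlMillis: Long,
    // Injectable clock so expiry can be tested without sleeping.
    private val clock: () -> Long = { System.currentTimeMillis() },
    // Stand-in for the real database lookup, e.g. get_by_id().
    private val loader: (K) -> V?
) {
    private data class Entry<T>(val value: T, val expiresAt: Long)

    // Shared by all threads of this node.
    private val entries = ConcurrentHashMap<K, Entry<V>>()

    fun get(key: K): V? {
        val now = clock()
        val cached = entries[key]
        if (cached != null && cached.expiresAt > now) return cached.value

        // Miss or expired entry: go to the database.
        // Two threads may both load the same key here; that is harmless
        // for a cache but costs one extra query.
        val loaded = loader(key)
        if (loaded == null) {
            entries.remove(key)
            return null
        }
        entries[key] = Entry(loaded, now + ttlMillis)
        return loaded
    }

    // Call this from every code path that modifies the document.
    fun invalidate(key: K) {
        entries.remove(key)
    }
}
```

The TTL bounds how stale an entry can get if some write path forgets to call `invalidate`, which is why combining both mechanisms tends to be more forgiving than relying on invalidation alone. Note that this sketch only invalidates on the local node; with several nodes you would need to broadcast invalidations or accept staleness up to the TTL.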
Although it's difficult to tell from your question, you can also reduce requests to the backend in general, and thereby traffic to your database, by optimizing how users authenticate and how data is stored on the client. For example, if you use JSON Web Tokens (JWTs) for authentication, you can store user data in the token itself, so the client keeps that data across pages without making any additional requests to the backend. These tokens can't be tampered with undetected, because they are signed with a key only your server knows, but their payload can be read by anybody; so if you're storing sensitive data, it is better to put that somewhere else. Like Redis keys, JWTs can expire after a set amount of time, but the contents of an expired token remain readable indefinitely.
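The "signed but readable" property is easy to see in code. Below is a rough sketch using only the JDK, assuming an HS256 token; the function names (`signToken`, `readPayload`, `isSignatureValid`) are made up for this example. In a real backend you would use a maintained JWT library, which also checks the `alg` header and the expiry claim, neither of which this sketch does.

```kotlin
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

private val urlEncoder = Base64.getUrlEncoder().withoutPadding()
private val urlDecoder = Base64.getUrlDecoder()

fun hmacSha256(data: String, secret: ByteArray): ByteArray {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(secret, "HmacSHA256"))
    return mac.doFinal(data.toByteArray(Charsets.UTF_8))
}

// Builds header.payload.signature, each part base64url-encoded.
fun signToken(payloadJson: String, secret: ByteArray): String {
    val header = urlEncoder.encodeToString(
        """{"alg":"HS256","typ":"JWT"}""".toByteArray(Charsets.UTF_8)
    )
    val payload = urlEncoder.encodeToString(payloadJson.toByteArray(Charsets.UTF_8))
    val signature = urlEncoder.encodeToString(hmacSha256("$header.$payload", secret))
    return "$header.$payload.$signature"
}

// No secret needed: anyone holding the token can read the payload.
fun readPayload(token: String): String =
    String(urlDecoder.decode(token.split(".")[1]), Charsets.UTF_8)

// Only the holder of the secret can check (or produce) a valid signature.
fun isSignatureValid(token: String, secret: ByteArray): Boolean {
    val parts = token.split(".")
    if (parts.size != 3) return false
    val expected = hmacSha256("${parts[0]}.${parts[1]}", secret)
    val actual = try {
        urlDecoder.decode(parts[2])
    } catch (e: IllegalArgumentException) {
        return false
    }
    // Constant-time comparison to avoid leaking where the bytes differ.
    return MessageDigest.isEqual(expected, actual)
}
```

If a client swaps in a different payload, `isSignatureValid` returns false because the signature no longer matches, while `readPayload` keeps working on any token, signed correctly or not.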
MongoDB, like many other database systems, offers caching and performance optimization out of the box, most notably through indexes. Whilst Redis might be a better long-term solution, properly indexing the fields you query on can make a huge difference in performance, which is why many companies have developers dedicated to optimizing data layouts, structures and indexes. If get_by_id() queries on `_id`, it already uses the index MongoDB creates on that field by default, so indexing mostly helps your other queries.
Especially with a NoSQL database like MongoDB, efficient querying is key. Joins are a pretty heavy workload for such a database system, so it might help to look into the queries you make and the logic behind them. By optimizing existing queries and weeding out the redundant ones, you can reduce database load as well.
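One common source of redundant queries is a loop that calls get_by_id() once per id, often with repeated ids. A possible fix, sketched below, is to collect the ids, deduplicate them, and fetch them in one batch. `Doc` and `fetchMany` are hypothetical names for this example; in a MongoDB setup `fetchMany` could be a single query filtering `_id` with `$in`.

```kotlin
// Hypothetical document type; not taken from the original post.
data class Doc(val id: String, val body: String)

// Resolves every requested id with at most one batch call.
// The result has the same order and length as `ids`; unknown ids map to null.
fun loadAll(
    ids: List<String>,
    fetchMany: (Set<String>) -> List<Doc> // assumed batch lookup
): List<Doc?> {
    val unique = ids.toSet()
    val byId: Map<String, Doc> =
        if (unique.isEmpty()) emptyMap()
        else fetchMany(unique).associateBy { it.id }
    return ids.map { byId[it] }
}
```

Compared with a cache, this does not help across requests, but it needs no invalidation logic at all and can be combined with caching later.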
Whilst a JSON-based document store like MongoDB has its perks, abusing the freedom of having no rules on how data should be structured often results in a lot more requests, because the data you need most is not the data that is easiest to reach. Again, it's difficult to tell from your question, but take a look at your queries and your document structure (if you have one): do the requests feel natural? Do they make sense? Or is the current structure getting in the way?
As my last recommendation, you should try to move as much logic as possible that is neither sensitive nor performance-intensive onto the client side (a trend modern frameworks seem to follow), to take load off your database and stop clients from requesting the same data over and over again. Especially if you're running a web app, this can be a major source of unnecessary traffic.
Upvotes: 3