Reputation: 105
So I have been preparing for and giving interviews. I got asked this question in 2 of the interviews and couldn't come up with a satisfying answer, or maybe not the one they wanted to hear.
The question is: setting aside the various operational techniques like load balancing, multiple instances, database replication and so on, what changes can you make in your application, i.e. the REST API itself, to make it capable of handling a huge number of requests?
What I've thought of so far is that we can make database or other API calls async so they run on a separate thread in the background, letting the request-handling thread move on to other requests. One of my colleagues suggested using a cache to minimize database calls.
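Something roughly like this is what I had in mind, just a sketch using Spring's @Async (the class and method names are made up, and the database lookup is a placeholder):

    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    import org.springframework.scheduling.annotation.Async;
    import org.springframework.stereotype.Service;

    @Service
    public class OrderLookupService {

        // Runs on a separate executor thread (requires @EnableAsync on a configuration
        // class), so the request-handling thread isn't blocked while the lookup runs.
        @Async
        public CompletableFuture<List<String>> findOrderIdsForCustomer(long customerId) {
            return CompletableFuture.completedFuture(slowDatabaseLookup(customerId));
        }

        // Placeholder for the real repository / JDBC call.
        private List<String> slowDatabaseLookup(long customerId) {
            return List.of("order-1-for-" + customerId, "order-2-for-" + customerId);
        }
    }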
We could also increase the size of the thread pool, but threads are expensive and there are only so many you can create. Also, if all the threads in the pool are busy, further requests are blocked until a thread becomes available, so that hardly even counts as an approach to this problem.
Overall, we concluded that there isn't much we can do other than make sure the API only performs lightweight operations, if that makes any sense.
I googled for this but didn't really find much beyond operational techniques and thread pooling.
I was wondering if the community could provide their input on this. How would you handle such a scenario?
Upvotes: 8
Views: 14053
Reputation: 3188
There are many things you can do to get the best performance out of Spring; here are a few.
Reactive
Utilize a reactive, non-blocking threading model. This is very beneficial when your application acts as a pass-through: you receive a REST request, and your application in turn has to make a REST request to a third-party service. A reactive model allows those threads to be reused while the request is being processed by the external service.
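A minimal sketch of that pass-through case with Spring WebFlux's WebClient (the downstream URL and endpoint are placeholders; assumes spring-boot-starter-webflux):

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;
    import org.springframework.web.reactive.function.client.WebClient;

    import reactor.core.publisher.Mono;

    @RestController
    public class PriceController {

        // Non-blocking HTTP client: no thread sits idle waiting for the downstream service.
        private final WebClient webClient = WebClient.create("https://pricing.example.com");

        @GetMapping("/prices/{id}")
        public Mono<String> price(@PathVariable String id) {
            // The event-loop thread is released while the third-party call is in flight.
            return webClient.get()
                    .uri("/prices/{id}", id)
                    .retrieve()
                    .bodyToMono(String.class);
        }
    }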
Statelessness
Make your services stateless, so that any instance can handle any request and nothing has to be kept in server-side session memory.
Security
Use a self-contained JWT rather than an opaque OAuth token that requires a database (or introspection) lookup on every request.
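For instance, with Spring Security's resource-server support a signed JWT is verified locally against the issuer's public keys, so no per-request lookup is needed (the issuer URI below is a placeholder):

    # application.properties sketch -- tokens are validated by signature, not by a DB call
    spring.security.oauth2.resourceserver.jwt.issuer-uri=https://auth.example.com/realms/demo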
Caching
Cache everything that you can. Use local caching rather than distributed caching wherever possible.
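A small sketch of the idea with Spring's cache abstraction; it assumes @EnableCaching is present and a local provider such as Caffeine is on the classpath, and the names are illustrative:

    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.stereotype.Service;

    @Service
    public class ProductService {

        // Repeated calls with the same id are served from the in-process cache,
        // skipping the database entirely.
        @Cacheable("products")
        public String findProductName(long id) {
            return slowDatabaseLookup(id);
        }

        // Placeholder for the real repository call.
        private String slowDatabaseLookup(long id) {
            return "product-" + id;
        }
    }

Size and expiry of the local cache can then be tuned, for example via spring.cache.caffeine.spec=maximumSize=10000,expireAfterWrite=10m.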
DB
Use a fast connection pool like HikariCP and ensure it has an optimal configuration. Optimize your queries. Faster response times mean more threads available for processing.
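For example, in application.properties (the numbers below are only illustrative starting points; tune them against your own load tests):

    spring.datasource.hikari.maximum-pool-size=20
    spring.datasource.hikari.minimum-idle=20
    spring.datasource.hikari.connection-timeout=3000
    spring.datasource.hikari.max-lifetime=1800000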
Thread management
Correctly configuring the thread pool used by an executor can make a big difference in application performance. However, this configuration must be based on the available resources of your system and the needs of the application.
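A sketch of an explicitly configured executor (the sizes are illustrative, not recommendations; derive them from your CPU count, memory, and how much your tasks block):

    import java.util.concurrent.Executor;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.scheduling.annotation.EnableAsync;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    @Configuration
    @EnableAsync
    public class AsyncConfig {

        @Bean
        public Executor taskExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(8);       // base number of worker threads
            executor.setMaxPoolSize(32);       // pool grows to this only once the queue is full
            executor.setQueueCapacity(500);    // tasks wait here before extra threads are created
            executor.setThreadNamePrefix("app-async-");
            executor.initialize();
            return executor;
        }
    }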
Microservice Architecture
If you are correctly using a microservice architecture, each service only receives requests relating to its own domain, which can reduce the load any single application has to handle.
Server Selection
Use an Undertow or Netty embedded server instead of the default Tomcat.
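For example, with Maven the default Tomcat starter can be swapped for Undertow like this (sketch; the Netty-based server comes with spring-boot-starter-webflux instead):

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-tomcat</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-undertow</artifactId>
    </dependency>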
Optimize JVM
Optimize JVM memory usage for your application or container.
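For example, container-friendly flags along these lines (illustrative only; measure before and after):

    # Let the JVM size its heap relative to the container's memory limit
    java -XX:MaxRAMPercentage=75.0 -XX:+UseG1GC -jar app.jar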
Upvotes: 10
Reputation: 1378
The first question that comes to mind is: what type of requests? To simplify, let's say there are two profiles, read-intensive and write-intensive, and elaborate on both.
If the app is read-intensive:
Invest most of your effort in caching. You can cache at several levels: at the REST level, caching the response completely, and also at the service and repository levels, depending on the consistency level you are targeting.
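A sketch of the REST-level part with Spring MVC, using Cache-Control so clients and intermediaries can reuse the response (the endpoint, names and the 60-second TTL are illustrative):

    import java.util.concurrent.TimeUnit;

    import org.springframework.http.CacheControl;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class CatalogController {

        // REST-level caching: callers may reuse the response for 60 seconds,
        // so repeated reads never even reach the service layer.
        @GetMapping("/catalog/{id}")
        public ResponseEntity<String> item(@PathVariable String id) {
            return ResponseEntity.ok()
                    .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS).cachePublic())
                    .body(loadItem(id));
        }

        // Placeholder for the real lookup.
        private String loadItem(String id) {
            return "item-" + id;
        }
    }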
If all the requests are for the same keys, and your objects are small enough to fit in the application's memory, you can get away with a local solution; otherwise you will need a cache provider to offload the caching to, something like Redis, Memcached, or the like. I'd rather not endorse any one in particular, as the community's preference shifts from time to time.
The reason behind this approach is that reading from L2 cache (or even main memory) is orders of magnitude faster than reaching for the database (when it actually has to read from disk). See more timings here.
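If you do need the distributed option, pointing Spring's cache abstraction at, say, Redis is mostly configuration (host and port are placeholders; recent Spring Boot versions use the spring.data.redis.* prefix, older ones spring.redis.*):

    spring.cache.type=redis
    spring.data.redis.host=cache.example.com
    spring.data.redis.port=6379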
If the app is write-intensive, or the above approach is not enough (even when you cache, you still have to fill that cache at some point), then it's time to move to a NoSQL/distributed database like MongoDB, Cassandra or Elasticsearch, depending on your CAP requirements. Some examples here. These databases are designed specifically for high throughput. You can also check their equivalents on AWS. Probably some de-normalization has to be done on your data to avoid expensive (or plainly unsupported) joins, as in the sketch below.
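A sketch of what that de-normalization can look like with Spring Data MongoDB (all names are made up): the order lines and the customer name are embedded in the order document, so a read is a single document fetch instead of a join:

    import java.util.List;

    import org.springframework.data.annotation.Id;
    import org.springframework.data.mongodb.core.mapping.Document;

    @Document("orders")
    public class OrderDocument {

        @Id
        private String id;
        private String customerName;   // copied from the customer record at write time
        private List<OrderLine> lines; // embedded instead of joined from another table

        public static class OrderLine {
            private String productName;
            private int quantity;
            private double price;
        }
    }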
I'm not sure if changing the DB was in scope for these questions, but caching and databases are usually where the biggest performance improvements can be achieved.
Good luck next time!
Upvotes: 7