StackOverFlow

Reputation: 4614

Spring Boot - How to improve API response time for an application with a monthly data refresh

We are using Spring Boot 1.5.10.RELEASE along with MongoRepository.

We have a huge amount of data, so we load static data (select * from table) on server startup using @PostConstruct.

The API response size is approx. 25 MB; we compress it with gzip, so the size becomes 5 MB.

We have multiple services. Every service includes a @PostConstruct method that loads frequently used data (select * from table) and prepares maps to improve performance.

In each service we prepare a Map<id, custom object> from a findAll DB query. Here is one service for reference:

@PostConstruct
public void init() {
    // Load the whole collection once and index it by id
    List<MyObj> list = xyzRepository.findAll();
    Map<String, MyObj> map = new HashMap<>();
    for (MyObj obj : list) {
        map.put(obj.id, obj);
    }
}

The user gets a fast response from the map when findById / findAll / findByList(List<employees> emps) is called.

Every month our DB gets refreshed/updated. Once the DB is refreshed, we restart the Spring Boot application.

The problem we are facing: if a DB refresh happens within the month (due to some issue/delay), we need to restart the server to get correct data instead of stale data.

We tried @Cacheable in every service, but the first DB hit takes too much time (as we are doing select * from table).

We make parallel AJAX calls to get the data. The application dashboard needs to plot data which is approx. 30 MB (5 MB gzipped), which is a pain.

There are approx. 2000 users and 20 services. Each service is called to get data (some return static data from the map initialised in @PostConstruct).

Currently we restart the server monthly when the DB refresh happens.

How can we get the latest data without a server restart when there is an unexpected DB refresh?

---------[Edit-1]-----

The upvoted answer suggests populating new/updated data in every map present in every service.

1. Application startup time will be high, as the maps will be populated from the database.

2. There is overhead in populating/maintaining the data in Java maps at some interval (nightly / alternate days / weekly).

What if the data in each collection/table grows every month?

I need an expert review of the upvoted answer.


Thanks for reading the question 😊

Upvotes: 3

Views: 22253

Answers (5)

Amit Vyas

Reputation: 790

I had a similar business problem: each month's data gets refreshed, and our deployment is on Kubernetes, so the same Spring Boot service runs in multiple pods.

My Approach 1:

  1. Map-based in-memory cache objects, shaped by the data requests and the responses to be generated.
  2. The maps are generated on the application start event.
  3. A Spring @Scheduled task checks the DB for new data each night.
  4. The database has a table that records whether a new month's data has been added. Step 3 checks that table periodically and initiates the cache rebuild only if there is new data.
  5. Extra checks on a sample of the data, to make sure the data population in the DB has completed before the cache rebuild starts.
  6. Rebuild the cache if everything up to step 5 succeeded.
  7. If the data keeps growing, move the in-memory cache to separate cache servers such as Redis or Memcached.
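Steps 3 to 6 could be sketched roughly as below, in plain Java so it runs without Spring. The class name VersionCheckedCache and both suppliers are hypothetical: versionReader stands in for a query against the marker table from step 4, and loader stands in for the findAll query. The re-read of the marker after loading is only one possible stand-in for the sampling check in step 5.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper: rebuilds an in-memory map only when a version marker changes.
class VersionCheckedCache<K, V> {
    private final LongSupplier versionReader;   // assumption: reads the "new data" marker table
    private final Supplier<Map<K, V>> loader;   // assumption: wraps the findAll query
    private volatile Map<K, V> data = Collections.emptyMap();
    private volatile long loadedVersion = Long.MIN_VALUE;

    VersionCheckedCache(LongSupplier versionReader, Supplier<Map<K, V>> loader) {
        this.versionReader = versionReader;
        this.loader = loader;
    }

    // Meant to be called once at startup and then from a nightly scheduled job.
    // Returns true if the cache was rebuilt.
    synchronized boolean refreshIfChanged() {
        long before = versionReader.getAsLong();
        if (before == loadedVersion) {
            return false;                        // nothing new, keep the current map
        }
        Map<K, V> fresh = new HashMap<K, V>(loader.get());
        // If the marker moved while we were loading, the load may mix old and
        // new rows, so discard it and try again on the next run.
        if (versionReader.getAsLong() != before) {
            return false;
        }
        data = Collections.unmodifiableMap(fresh);
        loadedVersion = before;
        return true;
    }

    V get(K key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }
}
```

In a Spring application, refreshIfChanged() would presumably be called from a method annotated with @Scheduled, with the two suppliers backed by repository calls.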

This will make startup time a bit higher, but that all depends on the data points of the requests and responses. Data points need to be chosen carefully according to the business need, rather than caching everything. But it will never force a manual restart of the server.

For all caching queries, use the JDBC query approach instead of the Hibernate approach: Hibernate takes more time/memory, whereas JdbcTemplate made query execution faster.

All caching queries should also have their execution plans checked, so that any missing indexes are found before moving to production.

My Approach 2:

Approach 1 logic, but all in a separate app.

Another way to improve performance is to take the caching logic out into a separate app deployed in its own pods, and use the Kubernetes scheduler as a crontab that rebuilds the cache as described above. The service then calls these pods to serve requests/responses from the cache. This approach offloads the cache's memory overhead from the service.

Having the in-memory cache as a separate app has an advantage over my first approach: we can do a cache swap, so until the new cache is built the service keeps serving requests from the old cache, and the user experience is not affected.
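The cache swap idea can be illustrated with a small plain-Java sketch. SwappableCache is a hypothetical name, and the loader is assumed to wrap whatever query builds the data. The point is only that the replacement map is built off to the side and published in a single step, so readers see either the complete old snapshot or the complete new one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical holder: readers always see a complete snapshot, old or new.
class SwappableCache<K, V> {
    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<Map<K, V>>(Collections.<K, V>emptyMap());

    // Builds the replacement first, then publishes it in one step.
    // If the loader throws, the old snapshot stays in place.
    void rebuild(Supplier<Map<K, V>> loader) {
        Map<K, V> next = Collections.unmodifiableMap(new HashMap<K, V>(loader.get()));
        current.set(next);
    }

    V get(K key) {
        return current.get().get(key);
    }

    // Returns the snapshot that is currently being served.
    Map<K, V> snapshot() {
        return current.get();
    }
}
```

While rebuild() is running, get() keeps answering from the previous snapshot. The trade-off is that both maps are in memory for a short time during the swap.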

Upvotes: 1

Ja'afar Naddaf

Reputation: 402

I suggest doing the following:

  • Annotate the response methods with @Cacheable("yourCacheName"). Also, remember to add @EnableCaching to the application.
  • Call the cached response method from within the application after it starts, to reduce the time taken by subsequent calls.
  • Create a "refresh" endpoint/method to be called when the DB has been refreshed, and annotate it with @CacheEvict(cacheNames = "yourCacheName", allEntries = true). Note that @CacheEvict does not allow a key together with allEntries = true.

Maybe you can even add @Scheduled to your refresh method so that it runs automatically at a fixed interval. Also, remember to add @EnableScheduling to the application.
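To make the load / warm-up / evict cycle concrete, here is a plain-Java analog. It is not Spring's cache abstraction; LazyResponseCache and its method names are made up for illustration, and the loader is assumed to wrap the slow query.

```java
import java.util.function.Supplier;

// Hypothetical analog of a cached method with an evict-and-warm refresh.
class LazyResponseCache<T> {
    private final Supplier<T> loader;   // assumption: wraps the slow DB query
    private T value;                    // guarded by "this"
    private boolean loaded;             // guarded by "this"

    LazyResponseCache(Supplier<T> loader) {
        this.loader = loader;
    }

    // Like a @Cacheable method: loads on the first call, then reuses the result.
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    // Like @CacheEvict(allEntries = true): the next get() reloads.
    synchronized void evict() {
        value = null;
        loaded = false;
    }

    // Evicts and reloads immediately, so the next caller does not pay for the query.
    synchronized T refresh() {
        evict();
        return get();
    }
}
```

Calling get() once right after startup is the warm-up from the second bullet. Calling refresh() after a DB refresh combines the evict and the warm-up. In this simple version callers block while the reload is running.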

For the DB, I think indexing the necessary tables could be helpful if not done already.

Upvotes: 0

Alexander Pavlov

Reputation: 2210

You can introduce an interface:

interface Refreshable {
    void refresh();
}

All beans that cache data during post-construct should implement it:

@Component
public class SomeDataProvider implements Refreshable {
   ...

   // Must be public: interface methods cannot be implemented with weaker access
   @Override public void refresh() { /* refresh data here */ }

   @PostConstruct
   public void postConstruct() {
       ...
       refresh();
       ...
   } 
}

Now expose a REST endpoint which you can call whenever the database changes:

@RestController
public class ForceRefresh {

   @Autowired 
   private List<Refreshable> refreshables; // here Spring will inject all services which can be refreshed

   @PostMapping
   public void forceRefresh() {
       // refresh concurrently using common thread pool
       refreshables.stream().parallel().forEach(Refreshable::refresh);
   }

}

Alternatively, instead of a REST endpoint you can implement a nightly reload; see the Spring docs for @EnableScheduling and @Scheduled.

As a side note, using @PostConstruct for loading is NOT an optimal approach, because Spring initializes beans on a single thread. It is better to implement an ApplicationReadyEvent listener, inject the list of Refreshables as in the example above, and load the data asynchronously using a thread pool (utilizing the full power of the multiple CPUs on your server and on Mongo).
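A minimal sketch of the thread-pool loading, written in plain Java so it runs without Spring. The Refreshable interface mirrors the one above; ParallelLoader, refreshAll and the threads parameter are hypothetical. In a Spring Boot application this call would presumably sit inside an ApplicationReadyEvent listener.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Same shape as the Refreshable interface shown earlier.
interface Refreshable {
    void refresh();
}

// Hypothetical helper that loads all caches concurrently.
class ParallelLoader {
    // Submits every refresh() to a fixed-size pool and waits until all have finished.
    static void refreshAll(List<? extends Refreshable> refreshables, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletableFuture<?>[] tasks = refreshables.stream()
                    .map(r -> CompletableFuture.runAsync(r::refresh, pool))
                    .toArray(CompletableFuture[]::new);
            // join() throws a CompletionException if any refresh failed
            CompletableFuture.allOf(tasks).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

Using a dedicated pool rather than the common pool keeps the slow DB loads from competing with other parallel work in the same JVM, at the cost of having to pick a pool size.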

Upvotes: 6

Katy

Reputation: 1157

Since MongoDB doesn't offer any mechanism to call your application back when an event happens, an alternative solution is to create a scheduled task with Spring Boot. The task updates the cached data at a frequency you define. You can keep the @PostConstruct method for the first start-up of your application.

Another point: you can improve the performance of your queries by adding indexes (maybe this has already been done).
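As a rough illustration of such a periodic refresh, here is a plain-Java sketch using the JDK scheduler instead of Spring's scheduling support. PeriodicRefresher is a hypothetical name and refreshTask is assumed to reload the cached maps. The first run is delayed by one interval, on the assumption that the initial load already happened at startup.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: re-runs a refresh task at a fixed delay.
class PeriodicRefresher implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Schedules the task to run after each delay. Runtime exceptions are caught
    // because an uncaught exception would cancel all later runs.
    void start(Runnable refreshTask, long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                refreshTask.run();
            } catch (RuntimeException e) {
                System.err.println("refresh failed, keeping old data: " + e);
            }
        }, delay, delay, unit);
    }

    // Stops the scheduler thread; call this on application shutdown.
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A fixed delay, rather than a fixed rate, means a slow refresh never overlaps with the next one.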

Upvotes: 0

Bav

Reputation: 139

I think that to solve this kind of problem, you need to collect metrics for your API with a monitoring service, because we do not have the full picture.

  • I suggest you look at a monitoring tool (Prometheus or Datadog).
  • An APM (application performance monitoring) tool like Dynatrace will help you inspect every API request and your SQL queries, and see what is happening to their response times.
  • You may also want to add Redis for caching, so that each query hits the database only once; and if a problem occurs in the database, you could check all the messages in the queue.

https://redis.io/topics/client-side-caching

After that, you can check where exactly the issue is: in the API service code or in the database itself.

Upvotes: 0
