Paras Malik

Reputation: 11

Splitting a monolithic service into multiple services

I have a real-time (latency of 10 ms) monolithic service that does multiple things in a single execution. I am facing a lot of problems because of the monolithic architecture, especially scaling the team and maintaining a complex code base. I have identified 3 different functional services and am planning to split the monolith into 3 separate services. But all of these services depend on the same data.

Since there is currently only one execution path, we only need to hit the DB (currently Redis) once per call. After splitting, there are 2 options:

  1. Hit the DB from all 3 services, but that will add latency to the final service output and increase hardware cost.
  2. Hit the DB only from the 1st service and pass that data on to the second and third services. The problem here is that a lot of data needs to be passed between services, which makes them much more dependent on the first service (a rough sketch of both options is below).
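Roughly, the two options look like this (a toy Python sketch just to illustrate; the function names are made up):

```python
# Toy sketch of the two options; fetch_shared_data stands in for the single
# Redis read the monolith does today, and the "services" are plain functions.

def fetch_shared_data(request_id):
    return {"request_id": request_id, "payload": "..."}

# Option 1: each service does its own fetch (extra DB round trips).
def service_a_v1(request_id):
    data = fetch_shared_data(request_id)
    return f"A processed {data['request_id']}"

def service_b_v1(request_id):
    data = fetch_shared_data(request_id)
    return f"B processed {data['request_id']}"

# Option 2: only the first service fetches and forwards the data
# (larger payloads, and B now depends on A's call shape).
def service_a_v2(request_id):
    data = fetch_shared_data(request_id)
    return service_b_v2(data)

def service_b_v2(data):
    return f"B processed {data['request_id']}"
```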

Please share your experience on which approach is better, or whether there is a better solution to this problem.

Upvotes: 0

Views: 745

Answers (1)

StoneyKeys

Reputation: 138

I don't think there's a magic bullet, but I'd recommend a few things to think about when cutting up a monolith.

First, beware of the "monolith split into microservices" mindset. A monolith and microservices are two different beasts, and each needs its own approach. Microservices can help you scale the team and cut down maintenance costs. But let's also remember that the monolithic architecture allowed you not to think about latency between modules or the size of the objects you pass between them. A change in architecture needs to be accompanied by a change in the patterns you use, e.g.:

  • Use messages instead of direct calls; prefer fewer synchronous operations and more parallel asynchronous tasks. Review your data flows.
  • Batching, batching, batching. Cross-service latency will kill your app if you handle entries one at a time (see the sketch after this list).
  • Even less coupling. In a monolith, engineers tend to leave more coupling than needed, which means more (expensive) calls between services. Do these services actually need to know about each other?
  • Eventual consistency. In a monolith, engineers tend to keep all data and object states consistent. Once you're in the microservice world, consistency is practically impossible - there's always a small delay. Striving for consistency does not go well with performance. But what if you make that delay a core principle? You might suddenly end up with a lot fewer dependencies between services. Here's a simple explanation of what eventual consistency means - Eventual consistency in plain English
  • Caching. Not much to add here - use caching :)
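For example, here's what batching can look like if you stay on Redis and use Python with redis-py (a minimal sketch - the key names are made up):

```python
import redis

# Hypothetical key layout: one hash per entity, keyed by ID.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
entity_ids = ["order:1", "order:2", "order:3"]

# One at a time: one network round trip per entry - latency adds up fast.
slow = [r.hgetall(key) for key in entity_ids]

# Batched: a single round trip for the whole batch via a pipeline.
pipe = r.pipeline()
for key in entity_ids:
    pipe.hgetall(key)
fast = pipe.execute()  # results come back in the same order as the queued calls
```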

A different trap is "splitting logically". Business logic often provides a nice and easy-to-understand way to split your code, but that is not always the best split latency-wise. So I suggest thinking carefully about your data flows before deciding on the exact cut lines. What are the minimal requirements for your data flows? Try drawing some kind of dependency graph and minimize it before doing the actual split (a rough sketch of that idea is below). Personally, I'd prefer an iterative approach: first spend some months (depending on project size) preparing the code - removing dependencies and reorganizing it into would-be microservices. Review the split every month or so. If it still looks good after 3-4 months, then you probably got it right (enough).
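As a rough sketch of that idea (plain Python; the module names and the candidate split are made up), you can count how many dependencies would turn into cross-service calls for a given cut:

```python
from collections import defaultdict

# Hypothetical call/data dependencies between modules of the monolith.
deps = {
    "pricing":   {"inventory", "customers"},
    "inventory": {"customers"},
    "reporting": {"pricing", "inventory"},
    "customers": set(),
}

# A candidate split: which would-be service owns which module.
split = {
    "pricing": "svc-a",
    "inventory": "svc-b",
    "customers": "svc-b",
    "reporting": "svc-c",
}

# Count edges that would become cross-service (i.e. network) calls.
cross = defaultdict(int)
for module, targets in deps.items():
    for target in targets:
        if split[module] != split[target]:
            cross[(split[module], split[target])] += 1

for (src, dst), n in sorted(cross.items()):
    print(f"{src} -> {dst}: {n} cross-service dependencies")
```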

There are probably a few more things to think about, but I assume you get my point - make sure you have the right architecture and appropriate patterns. You may still end up with the initial question (pass data between services vs. fetch multiple times). If so, I still don't think there's a generic answer, as there are too many variables:

  • Do you only care about performance, or also about (infrastructure) costs?
  • Do you need consistency? Passing data around might lead to stale data (e.g. the data is updated before another service processes it).
  • Do all services need ALL the data? Or does each service maybe need a small (different) chunk of it that can be fetched faster? (See the sketch after this list.)
  • What about making a read-only DB replica available?
  • Do you pay for the traffic? Is there a throttle on the traffic (e.g. does it get slower if you send a lot of data)?
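On the "does every service need ALL the data" point, here's a minimal sketch assuming the shared data lives in a Redis hash and you're on Python with redis-py (the field names are made up):

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Hypothetical layout: all per-request data lives in one hash per request ID.
request_key = "request:42"

# Instead of every service pulling the whole hash...
everything = r.hgetall(request_key)

# ...each service asks only for the fields it actually needs.
pricing_fields = r.hmget(request_key, ["sku", "quantity", "currency"])
shipping_fields = r.hmget(request_key, ["address", "weight"])
```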

All that being said, fetching the data directly from the source in each service - from the DB, or by calling a service that provides the data - seems like the simpler design. Simpler designs may pay back with lower maintenance costs :)
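If you do go that way and the extra round trips hurt, a short-lived in-process cache in each service can soften them. A minimal sketch, assuming Python with redis-py and a hypothetical hash key per request:

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Tiny in-process cache with a short TTL, so repeated lookups within one
# request window don't all turn into extra Redis round trips.
_cache: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 0.05  # tune to your staleness tolerance

def fetch(key: str) -> dict:
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]
    value = r.hgetall(key)  # each service goes straight to the source
    _cache[key] = (now, value)
    return value
```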

Upvotes: 2
