Reputation: 1541
I want to execute a single stateful or stateless Drools session, assuming the default thread pool size of Spring Boot. How should I factor in the server's processing power?
The rule is as simple as:
rule "Red apple"
when
    Apple( color == "red" )
then
    System.out.println("hello");
end
What is the maximum number of invocations per second that a rule engine can support? How do I calculate it?
Upvotes: 1
Views: 4100
Reputation: 15219
This depends on too many factors to say concretely "You need X resources per Y rules." Yes, rule count is (somewhat) important, but other things are important as well. What I have always done, and recommend you do as well, is allocate what you think is reasonable, run a stress test for what you think will be your peak load -- and then double that (or some other reasonable multiple.) Send concurrent requests. Send complicated data. Send simple data. Send data that actually looks legitimate. Make sure your logging solution (if relevant) doesn't affect your rule execution. Once you have observed that behavior you can adjust your production configuration.
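As a rough illustration of the stress test described above, here is a minimal sketch in plain Java. The class name `RuleLoadTest`, the method `measureThroughput`, and the placeholder workload are all hypothetical; in a real test the `Runnable` would insert your facts and fire your rules, and `threads` would be set to match the number of concurrent requests you expect at peak.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness: measures completed invocations per second for any Runnable.
class RuleLoadTest {

    /** Runs the task from `threads` concurrent workers, `callsPerThread` times each. */
    static double measureThroughput(Runnable task, int threads, int callsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new work, let submitted work finish
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        double seconds = (System.nanoTime() - start) / 1_000_000_000.0;
        return completed.get() / seconds;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); replace with a real rule invocation.
        Runnable fakeRuleCall = () -> Math.sqrt(System.nanoTime());
        double perSecond = measureThroughput(fakeRuleCall, 8, 10_000);
        System.out.printf("~%.0f invocations/second%n", perSecond);
    }
}
```

Run it with simple data, complicated data, and realistic data, and compare the measured rate against your expected peak times whatever safety multiple you chose.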
Generally speaking, you're going to see resource usage in the following ways:
Working memory. When you pass objects into the rule engine for execution, those objects are serialized and then deserialized. This is the primary consumer of heap -- the serialized/deserialized objects you pass into the rules. Note that if you have rules that create new objects and add them to memory, those will also increase your heap usage.
I mentioned serialization/deserialization explicitly because you should confirm that your objects do serialize/deserialize cleanly and properly. I once worked on a project where we had an object that we were passing into Drools that was ~20 KB but exploded to close to 1.2 MB once serialized. We couldn't figure out why we kept running out of memory when we were passing in so little data, but eventually we realized that it wasn't the data, it was the structure we were using that was serializing poorly.
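One way to spot this kind of blow-up early is to measure the serialized size of a representative fact before it ever reaches the rules. The sketch below assumes Java's built-in serialization, which may not be the mechanism your setup actually uses; `SerializedSizeCheck` and `serializedSize` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Hypothetical helper: reports how many bytes an object occupies once serialized.
class SerializedSizeCheck {

    /** Returns the byte count produced by Java's built-in serialization. */
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            // Includes NotSerializableException for fields that cannot be serialized.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Stand-in for a fact object (an assumption); use one of your real inputs.
        ArrayList<String> fact = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            fact.add("apple-" + i);
        }
        System.out.println("serialized bytes: " + serializedSize(fact));
    }
}
```

If the number is far larger than you expect for the data it carries, look at the object graph (nested collections, back-references, cached fields) rather than at the data itself.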
Reevaluation and loops. Rule execution takes CPU. This seems obvious, but there are particular Drools workflows which increase this CPU usage. If you ever call modify, update, or insert, this causes partial or complete reevaluation of the rules. So if your rules, when executed once, consume X amount of CPU, but there are rules that include these particular calls, then that X is not necessarily sufficient, because one execution might spawn multiple others.
This is important to keep in mind because CPU is one of the primary indicators -- along with dead threads -- of an infinite loop in your rule execution. If you see a sudden spike in CPU and it gets pegged there (and that thread becomes unresponsive), that's an indication that your rules are likely designed in a way that the input object has just triggered an infinite loop -- e.g., rule A triggers and updates the data such that rule B triggers and updates the data such that rule A triggers, and so on.
On the other hand, if you periodically see spikes in CPU that go back down, this is indicative of times when your rules are passed complex data such that it takes the rules engine a lot of "effort" to do its computation. The inputs might be triggering a large number of rules, or doing a lot of updates/modifications.
If you observe that your rules are "slow", it's likely they don't have enough CPU.
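To tell the runaway case (CPU pegged, thread unresponsive) apart from a merely slow execution during testing, one option is to put a time limit around each invocation. This is only a sketch in plain Java: `RuleWatchdog` and `finishesWithin` are hypothetical names, the looping task is a stand-in for two rules that keep re-triggering each other, and cancelling only helps if the running code responds to thread interruption, which is an assumption you would need to verify for your own rule execution.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical watchdog: flags invocations that do not finish within a time limit.
class RuleWatchdog {

    /** Returns true if the task completed within the limit, false if it timed out. */
    static boolean finishesWithin(Runnable ruleInvocation, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<?> result = worker.submit(ruleInvocation);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the worker; the task must honor interruption
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("rule invocation failed", e.getCause());
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for rules that re-trigger each other forever (an assumption).
        Runnable looping = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                // busy loop until interrupted
            }
        };
        System.out.println("finished in time: " + finishesWithin(looping, 200));
    }
}
```

Logging the input that caused a timeout gives you a reproducible case for tracking down which rules are feeding each other.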
These are the basic ongoing uses of resources. That is, when you have a server, the rules are loaded, and you're serving requests and passing them in/out of the rules, this is generally where you'll be attributing your resource consumption to.
However it is important to consider also that there is a one-time* hit for some framework related tasks that Drools executes.
I used to maintain a microsystem with 48,000 rules (ballpark) that processed about 15 requests per second and was allocated 16 gb memory and 4 gb CPU. These rules never changed the data in working memory -- it was a simple pass through. However it did take ~10 minutes to load those rules on startup. (I will note that while this was a Spring application, it was Spring 2. No Boot for me. :( ...)
Conversely I had the ... misfortune of also working on a monolith of an application that had about 1000 rules. This monster of an application sat on well over 120 gb heap and (last I was involved) about 64 gb CPU. These rules were very old (10 years or so) and were constantly calling update (which reevaluates everything from the top; it's the equivalent of calling the rules a second time with the updated information in working memory ... the alternative is insert, which is only a partial re-evaluation). These rules were also passed massive quantities of data (much of it unnecessary) in complex data structures. There were times when this application would consume most or all of its resources and slow down to near unresponsive because it couldn't handle the load. There were times when I was called in to identify the looping rules caused by the 'update' calls.
Basically what I'm saying is that the resources necessary are dependent on the characteristics of your application and the implementation of the rules themselves. The best way to determine the appropriate allocation is to perform stress tests to figure out your initial allocation, and then monitor. Keep an eye on your transaction volumes -- as you grow and have more clients or are servicing more requests you'll need to either increase your allocation or redo your stress tests to make sure you always have sufficient resources to handle your peak and then some.
If you promote good rule design (minimize calls to update, etc.) and pass into the rules only the data that is actually needed, you can usually scale your resource consumption with load in a linear fashion with a relatively mild slope. Unfortunately this is one of those things where saying "X resources per Y rules" isn't really possible because there are too many variables.
* -- I say "one time" hit for things like loading the rules, but it's really one time per load. If you load up your rules once on application start, it's a one time hit. But if you're reloading your rules after they change or periodically or whatever, you'll have to account for this during those times as well.
Upvotes: 9