Reputation: 113
We have been using the Drools engine for a few years now, but our data has grown, and we need a distributed solution that can handle a large amount of data. We have complex rules that look over a few days of data, which is why Drools was a great fit for us: we could simply keep our data in memory.
Do you have any suggestions for something similar to Drools but distributed/scalable?
I did some research on the matter, and I couldn't find anything that meets our requirements.
Thanks.
Upvotes: 4
Views: 8286
Reputation: 11
It seems that Databricks is also working on a rules engine. If you are using the Databricks version of Spark, it is something to look into.
https://github.com/databrickslabs/dataframe-rules-engine
Upvotes: 1
Reputation: 3070
Take a look at https://www.elastic.co/blog/percolator
What you can do is convert each rule to an Elasticsearch query and store it in a percolator index. You can then percolate your data against that index, which returns the rules that match the provided data.
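As a rough illustration of that idea, here is a minimal sketch of the request bodies involved. The index name, the field names (`amount`, `country`) and the example rule are made up for illustration; only the `percolator` field type and the `percolate` query come from Elasticsearch itself. The code only builds the JSON and does not talk to a cluster.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX = "rules"

# Mapping for the rules index: one field of type "percolator" holds the
# stored query (the rule). The fields that rules refer to are mapped too.
# "amount" and "country" are assumed example fields.
mapping = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "amount": {"type": "double"},
            "country": {"type": "keyword"},
        }
    }
}


def rule_document(min_amount, country):
    """Express the rule 'amount >= min_amount AND country == country'
    as a document whose "query" field is a stored Elasticsearch query."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"amount": {"gte": min_amount}}},
                    {"term": {"country": country}},
                ]
            }
        }
    }


def percolate_request(event):
    """Build the search body that asks which stored rules match this event."""
    return {"query": {"percolate": {"field": "query", "document": event}}}


# Print the three bodies you would send: create index, index a rule, search.
print(json.dumps(mapping, indent=2))
print(json.dumps(rule_document(1000, "DE"), indent=2))
print(json.dumps(percolate_request({"amount": 1500, "country": "DE"}), indent=2))
```

The hits returned by the search are the rule documents whose stored query matches the event. Note that this matches one document at a time, so rules that need to look across several days of data would need extra work on top of this.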
Upvotes: 0
Reputation: 457
Spark can apply Drools rules to the data faster than a traditional single-node application. The reference architecture for the Drools - Spark integration could be along the following lines. In addition, HACEP is a scalable and highly available architecture for Drools Complex Event Processing. HACEP combines Infinispan, Camel, and ActiveMQ. Please refer to the following article on HACEP using Drools.
You can find a reference implementation of Drools - Spark integration in the following GitHub repository.
Upvotes: 2
Reputation: 317
Maybe this could be helpful to you. It is a new project developed as part of the Drools ecosystem. https://github.com/kiegroup/openshift-drools-hacep
Upvotes: 1
Reputation: 29195
First of all, in my experience Drools can be applied efficiently even to very large volumes of data (some tuning may be needed depending on your requirements), and it integrates easily with Apache Spark. Loading your rule file into memory for Spark processing takes very little memory, and Drools can be used with Spark Streaming as well as Spark batch jobs.
See my complete article for your reference and try.
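To make the pattern concrete, here is a minimal plain-Python sketch of the idea, not the actual Drools or Spark API. The rule set is small and shared, the data is split into partitions, and each partition is evaluated independently. The `RULES` list, the event fields and the helper names are all hypothetical stand-ins. In a real job the per-partition function would typically be passed to something like Spark's `mapPartitions` and would create a Drools session instead of calling Python predicates.

```python
import zlib
from collections import defaultdict

# Hypothetical stand-in for a compiled rule set: (rule name, predicate) pairs.
# In a real job this would be the Drools rule base loaded once per executor.
RULES = [
    ("large_order", lambda e: e["amount"] >= 1000),
    ("foreign_order", lambda e: e["country"] != "US"),
]


def partition_by_key(events, key, num_partitions):
    """Group events so that all events sharing a key land in the same
    partition, which is what rules spanning several events would need."""
    parts = defaultdict(list)
    for event in events:
        bucket = zlib.crc32(str(event[key]).encode()) % num_partitions
        parts[bucket].append(event)
    return list(parts.values())


def apply_rules(rules, partition):
    """Evaluate every rule against every event of one partition and
    yield (event id, names of fired rules) for events that match."""
    for event in partition:
        fired = [name for name, predicate in rules if predicate(event)]
        if fired:
            yield event["id"], fired


def run(events, rules, num_partitions=2):
    """Sequential simulation of the distributed job: each partition is
    processed on its own, and the results are merged at the end."""
    results = {}
    for partition in partition_by_key(events, "customer", num_partitions):
        results.update(apply_rules(rules, partition))
    return results


events = [
    {"id": 1, "customer": "a", "amount": 1500, "country": "US"},
    {"id": 2, "customer": "b", "amount": 10, "country": "DE"},
    {"id": 3, "customer": "a", "amount": 5, "country": "US"},
]
print(run(events, RULES))
```

Because no partition depends on another, the loop in `run` is the part a cluster can parallelize. Only the small rule set has to be shipped to every worker.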
An alternative might be Jess.
Jess implements the Rete algorithm and accepts rules in multiple formats, including CLIPS and XML.
Jess uses an enhanced version of the Rete algorithm to process rules. Rete is a very efficient mechanism for solving the difficult many-to-many matching problem.
Jess has many unique features including backwards chaining and working memory queries, and of course Jess can directly manipulate and reason about Java objects. Jess is also a powerful Java scripting environment, from which you can create Java objects, call Java methods, and implement Java interfaces without compiling any Java code.
Try it yourself.
Upvotes: 0