chaosguru

Reputation: 1983

Executing the Drools rules engine on a DB to remove duplicates

My question is not really about Drools or rule engines in general; I have a specific use case where I am planning to use a rule engine. I read through the existing questions and searched online but could not find a good fit. My query is below.

We have a crawler engine which pumps data into a DB. Since the data is huge, we often end up with a few duplicate entries. Currently the de-duplication rules are tightly bound to the DB tables and implemented as complex queries. I thought of having a rule engine operate on top of the table, but I have been unable to achieve this with rules. Am I missing something, or is my understanding wrong?

  1. First question: is using a rule engine the right approach?
  2. Second: if rules can be used, I could not find an approach for firing rules on an array.

The questions may be naive, but I have still not found a solution.

Upvotes: 0

Views: 719

Answers (1)

Gergely Bacso

Reputation: 14661

For what you described above, Drools is really not a good fit. However, depending on exactly what you are trying to achieve, you may still find it useful. Instead of removing duplicates, you can use a rule engine to prevent duplicates from being inserted. To do that, you need a stateful session holding your existing record set, and you can write your own evaluation rules in Drools to mark incoming entries as duplicates. Based on the result of the rule execution, you can decide whether the new entry should be saved or discarded as a duplicate. What you should consider:

  • Do you want to invest that much time/effort into this task?
  • Do you really need a rule-engine, for example do you expect frequent changes in the validation logic?
  • Would there be any issue with the performance of Drools? (the volume of your data or the frequency of the incoming records might exceed the capabilities of Drools)
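
To make the "prevent insertion" idea concrete, here is a minimal plain-Java sketch of the check such a rule would have to perform, without Drools. It is only an illustration: the class name DuplicateGate, the keyOf helper, and the duplicate criterion (same URL, ignoring case and surrounding whitespace) are assumptions, not something taken from the question. If your real criterion is about this simple, it may also help you answer the second bullet above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical gate that decides, per incoming record, whether to save or discard it.
class DuplicateGate {
    // Keys of records already stored; in a real system this would be preloaded from the DB.
    private final Set<String> seenKeys = new HashSet<>();

    // Assumed duplicate criterion: same URL, ignoring case and surrounding whitespace.
    static String keyOf(String url) {
        return url.trim().toLowerCase(Locale.ROOT);
    }

    // Returns true if the record is new (and remembers it), false if it is a duplicate.
    boolean accept(String url) {
        return seenKeys.add(keyOf(url));
    }

    public static void main(String[] args) {
        DuplicateGate gate = new DuplicateGate();
        // Each incoming record is checked one at a time, much like inserting one fact per record.
        for (String url : List.of("http://a.example/x", "HTTP://A.example/x ", "http://a.example/y")) {
            System.out.println(url.trim() + " -> " + (gate.accept(url) ? "save" : "discard"));
        }
    }
}
```

With Drools, the set of seen keys would roughly correspond to the facts held in the stateful session, and accept would correspond to a rule that matches an incoming entry against an existing one. Note that either way the existing record set has to fit in memory, which ties back to the performance question.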

Upvotes: 0
