Reputation: 1193
I have a heuristic of the following format:
if a == 1 and b == 1 and c == 1:
do something
if a == 1 and b == 1 and c == 2:
do something
if a == 3 and b == 2 and c == 1:
do something
if a == 2 and b == 2 and c == 1:
do something
if a == 3 and b == 1 and c == 3:
do something
etc.
Of course, this makes many unnecessary comparisons which could be avoided if the statements were nested like this:
if a == 1:
if b == 1:
if c == 1:
do something
if c == 2:
do something
etc.
Assuming the set of tuples (a, b, c)
for which I have a case is finite, and each one has the same likelihood of being received by the algorithm, how can I generate a decision tree which is optimal, i.e., that it makes the least number of comparisons for the general case, or that minimizes the total number of comparisons when all the inputs are run through it?
I imagine something like this:
In: optimal_tree([(1, 1, 1), (1, 1, 2)])
Out: "if a == 1:
if b == 1:
if c == 1:
do something
if c == 2:
do something"
In: optimal_tree([(1, 1), (2, 1), (2, 2)])
Out: "if a == 2:
if b == 1:
do something
if b == 2:
do something
if a == 1:
do something"
OR
"if b == 1:
if a == 1:
do something
if a == 2:
do something
if b == 2:
do something"
Upvotes: 1
Views: 221
Reputation: 27312
Rule engines and database queries regularly deal with this problem too. You might want to look into the implementation behind those.
They have several ways to deal with this (although none is perfect):
If you want to make your algorithm faster, you might want to look into hashing & indexing if you're not doing that already. That brings in a much bigger scale advantage.
Upvotes: 1