Query with inequality in index using Progress

Question

What is the fastest way to query for records using an inequality in Progress 4GL? For example, if I need to find all the records whose state field doesn't match 'MI', how would I write that for best performance and why?

Suggestions I've been given include using a broader or different index and then filtering with an IF statement, to avoid any inequality in the WHERE clause, such as:

FOR EACH record NO-LOCK:
  IF record.state = "MI" THEN NEXT.
  /*do stuff*/
END.

I've been told to avoid using NE statements, as they kill performance,

FOR EACH record NO-LOCK
  WHERE record.state NE "MI":
  /*do stuff slowly, apparently*/
END.

but I've also been told using OR is evil as well.

FOR EACH record NO-LOCK
  WHERE record.state = "WI" OR record.state = "AL":
  /*didn't write all 49 minus MI for space*/
END.

I've not been given substantive evidence for why any of these three would be superior, and there isn't sufficient data in my development environment to test with the actual situation I'm working on.

Tim Kuehn · Accepted Answer

It all depends on how well your query matches up with an available index.

Your first example does what is called a "table scan" - it'll look at every single record in the table before doing the IF to see if it's the one you want. Most of the time this is not what you want, particularly if the table is either large or frequently queried.

Equality ("=") is the most performant operator, particularly when there's an index on the field or fields you're querying.
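
As a rough sketch of what that implies for the NE case in the question: a commonly suggested rewrite replaces the inequality with two range comparisons, which are operators the engine can match against an index. This assumes the table has an index whose leading component is state and that state is never the unknown value (?); neither is stated in the question. Whether it is actually faster depends on the data, since a query that excludes only one state still reads most of the table.

```
/* Sketch only: assumes an index with state as its leading component,
   and that state is never unknown (?). */
FOR EACH record NO-LOCK
    WHERE record.state < "MI" OR record.state > "MI":
    /* do stuff */
END.
```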

"OR" can be evil if it's combined with an "AND" like so:

WHERE customer.AmountDue > SomeValue AND 
      (customer.state = "MI" OR customer.state = "WI").

The reason is that the db engine can't do any index lookup with the ORs, so it'll resolve the ">" operator and then check every record that matches the ">" to see if it matches either of the two states.

This can be fixed by refactoring the WHERE like so:

WHERE (customer.AmountDue > SomeValue AND customer.state = "MI") OR 
      (customer.AmountDue > SomeValue AND customer.state = "WI").

With this structure, the db engine has two AND phrases, each of which it can resolve to a smaller set of results. It then merges the two lists together, and the end result is a single set of records for the query to traverse. This is much faster than using the OR in the first part of your question.

It all comes down to the query matching an index on the table you're querying. If there's an index that exactly matches what you're looking for, it'll go a lot faster than if there's an index that partially matches your query, or if there's no matching index at all.
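
One way to check how well a query matches an index is to compile the procedure with the XREF option and read the resulting file. The sketch below assumes the query lives in a procedure file; the names myquery.p and myquery.xrf are hypothetical.

```
/* Sketch: write a cross-reference listing for the compiled procedure.
   "myquery.p" and "myquery.xrf" are hypothetical file names. */
COMPILE myquery.p XREF myquery.xrf.
```

In the XREF output, each SEARCH line names the table and the index chosen for a query. A SEARCH line flagged WHOLE-INDEX means the engine found nothing in the WHERE clause to narrow the index with, so it will walk the entire index.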

What you need to do is check out some of the excellent presentations given at the different PUG Challenge conferences. You can find a presentation on index selection given at PUG Challenge Americas here:

pugchallenge.org/downloads2015.html

Presentations given at PUG Challenge EMEA are under the "prior events" tab at http://pugchallenge.eu

Good luck!
