Reputation: 117
We are using OpenEdge 10.2A and generating summary reports with Progress (ABL) procedures. We want to decrease the time it takes to produce these reports.
Since the ACCUMULATE statement and the ACCUM function are not really faster than defining variables to collect summarized values, and their readability is much worse, we don't really use them.
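(For reference, the ACCUMULATE form of the count we are comparing against would look roughly like this; it is only a sketch of the counting case, not our production code.)
FOR EACH orderline FIELDS (ordernum) NO-LOCK:
    ACCUMULATE orderline.ordernum (COUNT).
END.
MESSAGE "Count = " (ACCUM COUNT orderline.ordernum) VIEW-AS ALERT-BOX.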
We have run the same tests with SQL statements over an ODBC connection, and the results are much faster than with procedures.
Let me give you an example. We run the procedure below:
DEFINE VARIABLE i AS INTEGER NO-UNDO.
ETIME(TRUE).
FOR EACH orderline FIELDS(ordernum) NO-LOCK:
    ASSIGN i = i + 1.
END.
MESSAGE "Count = " (i - 1) SKIP "Time = " ETIME VIEW-AS ALERT-BOX.
The result is:
Count= 330805
Time= 1891
When we run equivalent SQL query:
SELECT COUNT(ordernum) FROM pub.orderline
The execution time is 141.
In short, when we compare the two results, the SQL query is more than 13 times faster than the procedure.
This is just an example. We can do the same test with other aggregate functions and the time ratio does not change much.
My question has two parts:
1-) Is it possible to get aggregate values with procedures as fast as with SQL queries?
2-) Is there any other method of getting summarized values faster, other than running SQL queries in real time?
Upvotes: 2
Views: 1533
Reputation: 1
/* PRESELECT builds the query's result list up front, so NUM-RESULTS returns the record count */
DEFINE QUERY q FOR orderline.
OPEN QUERY q PRESELECT EACH orderline NO-LOCK.
MESSAGE NUM-RESULTS("q") VIEW-AS ALERT-BOX.
Upvotes: 0
Reputation: 404
Some additional suggestions:
It is possible to write your example as
DEFINE VARIABLE i AS INTEGER NO-UNDO.
ETIME(TRUE).
SELECT COUNT(*) INTO i FROM orderline.
MESSAGE "Count = " (i - 1) SKIP "Time = " ETIME VIEW-AS ALERT-BOX.
which should yield a moderate performance increase. (This does not use an ODBC connection; you can use a subset of SQL in plain 4GL procedures. It is debatable whether this can be considered good style.)
There should be a significant performance increase from accessing the database through shared memory instead of TCP/IP, provided you are running the code on the database server (which you are) and are not already connecting that way (which you didn't specify).
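Roughly, the difference between the two connection styles looks like this; the database name, host, and port are placeholders:
/* networked client-server connection over TCP/IP (-H host, -S service) */
CONNECT mydb -H dbserver -S 20000.

/* self-service (shared memory) connection: run the session on the database
   server itself and omit -H/-S so the client attaches directly */
CONNECT mydb.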
Upvotes: 1
Reputation: 14020
The 4GL and SQL engines use very different approaches to sending data to the client. By default SQL is much faster. To get similar performance from the 4GL you need to adjust several parameters. I suggest:
-Mm 32600 # message size, default 1024, max 32600
-prefetchDelay # don't send the first record immediately, instead bundle it
-prefetchFactor 100 # try to fill message 100%
-prefetchNumRecs 10000 # if possible pack up to 10,000 records per message, default 16
Prior to 11.6, changing -Mm requires BOTH the client and the server to be set to the matching value. Starting with 11.6, only the server needs to be changed.
You need at least OpenEdge 10.2B06 for the -prefetch* parameters.
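For example, these could be applied when starting the database broker; the database path and port below are placeholders:
proserve /db/mydb -S 20000 -Mm 32600 -prefetchDelay -prefetchFactor 100 -prefetchNumRecs 10000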
Although there are caveats (among other things, joins will not benefit), these parameters can potentially greatly improve the performance of "NO-LOCK queries". A simple:
FOR EACH table NO-LOCK:
/* ... */
END.
can be greatly improved by the use of the parameters above.
Use of a FIELDS list can also help a lot because it reduces the amount of data and thus the number of messages that need to be sent. So if you only need some of the fields rather than the whole record you can code something like:
FOR EACH customer FIELDS ( name balance ) NO-LOCK:
or:
FOR EACH customer EXCEPT ( photo ) NO-LOCK:
You are already using FIELDS and your sample query is a simple NO-LOCK so it should benefit substantially from the suggested parameter settings.
Upvotes: 4
Reputation: 8011
The issue at hand seems to be to "decrease the production time of the reports".
This raises some questions, and it will be impossible to answer yours without more information. Data access from ABL will most likely be fast enough if certain conditions are met.
The time it takes for a simple command like FOR EACH <table> NO-LOCK:
or SELECT COUNT(something) FROM <somewhere>
might not indicate how fast or slow your real super complicated query might run.
Upvotes: 1