Reputation: 1898
I have read a book whose title is "Oracle PL SQL Programming" (2nd ed.) by Steven Feuerstein & Bill Pribyl. On page 99, there is a point suggested that
Do not "SELECT COUNT(*)" from a table unless you really need to know the total number of "hits." If you only need to know whether there is more than one match, simply fetch twice with an explicit cursor.
Could you anyone explain this point more to me by providing example? Thank you.
As Steven Feuerstein & Bill Pribyl recommends us not to use SELECT COUNT() to check whether records in a table exist or not, could anyone help me edit the code below in order to avoid using SELECT COUNT(*) by using explicit cursor instead? This code is written in the Oracle stored procedure.
I have a table emp(emp_id, emp_name, ...), so to check the provided employee ID corret or not:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
...
SELECT COUNT(*) INTO v_rows
FROM emp
WHERE emp_id = emp_id_in;
IF v_rows > 0 THEN
/* do sth */
END;
/* more statements */
...
END do_sth;
Upvotes: 11
Views: 27396
Reputation: 132680
There are a number of reasons why developers might perform select COUNT(*) from a table in a PL/SQL program:
In this case there is no choice: select COUNT(*) and wait for the result. This will be pretty fast on many tables, but could take a while on a big table.
This doesn't warrant counting all the rows in the table. A number of techniques are possible:
DECLARE
CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
v VARCHAR2(1);
BEGIN
OPEN c;
FETCH c INTO v;
IF c%FOUND THEN
-- A row exists
...
ELSE
-- No row exists
...
END IF;
END;
DECLARE
v VARCHAR2(1);
BEGIN
SELECT '1' INTO v FROM mytable
WHERE ...
AND ROWNUM=1; -- Stop fetching if 1 found
-- At least one row exists
EXCEPTION
WHEN NO_DATA_FOUND THEN
-- No row exists
END;
DECLARE
cnt INTEGER;
BEGIN
SELECT COUNT(*) INTO cnt FROM mytable
WHERE ...
AND ROWNUM=1; -- Stop counting if 1 found
IF cnt = 0 THEN
-- No row found
ELSE
-- Row found
END IF;
END;
Variations on the techniques for (2) work:
DECLARE
CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...;
v VARCHAR2(1);
BEGIN
OPEN c;
FETCH c INTO v;
FETCH c INTO v;
IF c%FOUND THEN
-- 2 or more rows exists
...
ELSE
-- 1 or 0 rows exist
...
END IF;
END;
DECLARE
v VARCHAR2(1);
BEGIN
SELECT '1' INTO v FROM mytable
WHERE ... ;
-- Exactly 1 row exists
EXCEPTION
WHEN NO_DATA_FOUND THEN
-- No row exists
WHEN TOO_MANY_ROWS THEN
-- More than 1 row exists
END;
DECLARE
cnt INTEGER;
BEGIN
SELECT COUNT(*) INTO cnt FROM mytable
WHERE ...
AND ROWNUM <= 2; -- Stop counting if 2 found
IF cnt = 0 THEN
-- No row found
IF cnt = 1 THEN
-- 1 row found
ELSE
-- More than 1 row found
END IF;
END;
Which method you use is largely a matter of preference (and some religious zealotry!) Steven Feuerstein has always favoured explicit cursors over implicit (SELECT INTO and cursor FOR loops); Tom Kyte favours implicit cursors (and I agree with him).
The important point is that to select COUNT(*) without restricting the ROWCOUNT is expensive and should therefore only be done when a count is trully needed.
As for your supplementary question about how to re-write this with an explicit cursor:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
v_rows INTEGER;
BEGIN
...
SELECT COUNT(*) INTO v_rows
FROM emp
WHERE emp_id = emp_id_in;
IF v_rows > 0 THEN
/* do sth */
END;
/* more statements */
...
END do_sth;
That would be:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE )
IS
CURSOR c IS SELECT 1
FROM emp
WHERE emp_id = emp_id_in;
v_dummy INTEGER;
BEGIN
...
OPEN c;
FETCH c INTO v_dummy;
IF c%FOUND > 0 THEN
/* do sth */
END;
CLOSE c;
/* more statements */
...
END do_sth;
But really, in your example it is no better or worse, since you are selecting the primary key and Oracle is clever enough to know that it only needs to fetch once.
Upvotes: 22
Reputation: 36987
Before you take Steven Feuerstein's suggestions too serious, just do a little benchmark. Is count(*) noticeably slower than the explicit cursor in your case? No? Then better use the construct that allows for simple, readable code. Which, in most cases, would be "select count(*) into v_cnt ... if v_cnt>0 then ..."
PL/SQL allows for very readable programs. Don't waste that just to nano-optimize.
Upvotes: 1
Reputation: 5616
If you want to get number of rows in a table, please don't used count(*), I would suggest count(0) that 0 is the column index of your primary key column.
Upvotes: -2
Reputation: 89721
He means open a cursor and fetch not only the first record but the second, and then you will know there is more than one.
Since I never seem to need to know that SELECT COUNT(*)
is >= 2
, I have no idea why this is a useful idiom in any SQL variant. Either no records or at least one, sure, but not two or more. And anyway, there's always EXISTS
.
That, and the fact that Oracle's optimizer seems to be pretty poor... - I would question the relevance of the technique.
To address TheSoftwareJedi's comments:
WITH CustomersWith2OrMoreOrders AS (
SELECT CustomerID
FROM Orders
GROUP BY CustomerID
HAVING COUNT(*) >= 2
)
SELECT Customer.*
FROM Customer
INNER JOIN CustomersWith2OrMoreOrders
ON Customer.CustomerID = CustomersWith2OrMoreOrders.CustomerID
Appropriately indexed, I've never had performance problems even with whole universe queries like this in SQL Server. However, I have consistently run into comments about Oracle optimizer problems here and on other sites.
My own experience with Oracle has not been good.
The comment from the OP appears to be saying that full COUNT(*)
from tables are not well handled by the optimizer. i.e.:
IF EXISTS (SELECT COUNT(*) FROM table_name HAVING COUNT(*) >= 2)
BEGIN
END
(which, when a primary key exists, can be reduced to a simple index scan - in a case of extreme optimization, one can simply query the index metadata in sysindexes.rowcnt - to find the number of entries - all without a cursor) is to be generally avoided in favor of:
DECLARE CURSOR c IS SELECT something FROM table_name;
BEGIN
OPEN c
FETCH c INTO etc. x 2 and count rows and handle exceptions
END;
IF rc >= 2 THEN BEGIN
END
That, to me would result in less readable, less portable, and less maintainable code.
Upvotes: 1
Reputation: 65445
SQL Server:
if 2 = (
select count(*) from (
select top 2 * from (
select T = 1 union
select T = 2 union
select T = 3 ) t) t)
print 'At least two'
Also, don't ever use cursors. If you think you really really need them, beat yourself with a shovel until you change your mind. Let relics from an ancient past remain relics from an ancient past.
Upvotes: -1
Reputation: 28882
If two is all you are interested in, try
SELECT 'THERE ARE AT LEAST TWO ROWS IN THE TABLE'
FROM DUAL
WHERE 2 =
(
SELECT COUNT(*)
FROM TABLE
WHERE ROWNUM < 3
)
It will take less code than doing the manual cursor method, and it is likely to be faster.
The rownum trick means to stop fetching rows once it has two of them.
If you don't put some sort of limit on the count(*), it could take a long while to finish, depending on the number of rows you have. In that case, using a cursor loop, to read 2 rows from the table manually, would be faster.
Upvotes: 5
Reputation: 35236
This comes from programmers writing code similar to the following (this is psuedo code!).
You want to check to see if the customer has more than one order:
if ((select count(*) from orders where customerid = :customerid) > 1)
{
....
}
That is a terribly inefficient way to do things. As Mark Brady would say, if you want to know if a jar contains pennies, would you count all the pennies in the jar, or just make sure there is 1 (or 2 in your example)?
This could be better written as:
if ((select 1 from (select 1 from orders where customerid = :customerid) where rownum = 2) == 1)
{
....
}
This prevents the "counting all of the coins" dilemma since Oracle will fetch 2 rows, then finish. The previous example would cause oracle to scan (an index or table) for ALL rows, then finish.
Upvotes: 3
Reputation: 4133
Depending on the DB, there may be a sys table which stores an approximate count and can be queried in constant time. Useful if you want to know whether the table has 20 rows or 20,000 or 20,000,000.
Upvotes: 0