Mikhail T.
Mikhail T.

Reputation: 4017

Sleeping for a random amount of time in a stored procedure

A vendor application we're using utilizes multiple processes (not threads) spread across multiple computers, one for each processing core available. When asked for a batch of calculations, the application distributes the load between these engines, which then go to the database for additional details necessary to perform the computations.

As we add more cores -- and engines -- the simultaneous rush to the database causes a load spike so big, some engines start getting the ORA-12170: TNS:Connect timeout occurred error, which causes them to crash (I know, it shouldn't, but we cannot fix the vendor's code). For example, with the engine-count of 1386, about 300 engines dump cores, because the 16-core Oracle cannot cope...

There is no official way to cause the application to stagger these engine-initializations...

But the first thing each engine is doing is invoke the same stored procedure -- which updates the same table -- and I'm wondering, if I can alter it to perform the staggering: cause each engine to sleep for random number of milliseconds.

What would the SQL snippet for this look like? Our servers run Oracle-19...

Upvotes: 1

Views: 111

Answers (2)

Paul W
Paul W

Reputation: 11603

If you are timing out on the connection, you'll never get to where you can use PL/SQL to do the staggering. Here are some other ideas:

(1) configure your listener to support more connections and give it more time. In listener.ora:

INBOUND_CONNECT_TIMEOUT_LISTENER=110
MAX_ALL_CONNECTIONS_LISTENER=1000
MAX_REG_CONNECTIONS_LISTENER=1000

(2) disable any listener logging in listener.ora:

LOGGING_LISTENER=OFF
LOGGING_LISTENER_SCAN1=OFF
LOGGING_LISTENER_SCAN2=OFF
LOGGING_LISTENER_SCAN3=OFF

(3) create more listeners. Each one can only work so hard to fork a shadow process and copy its memory, so a connection storm will bottleneck on a single listener process. You can create as many as you need. I have a system with 16 of them, each listening on a different port. If you use RAC and are going through the SCAN listener, each of these can be registered and it'll round-robin through them. If you aren't using RAC, you can perhaps have your clients randomize the port they use.

(4) add randomized sleep on the client side. In Unix, something like:

sleep `echo "\`echo | nawk -v seed=$$ '{srand(seed);print rand()*10}'\`/1" | bc 2>/dev/null`

(5) disable anything in the database slowing down those connections. Logon triggers are the first thing to check, followed by auditing (disable the logon audit trail), followed by any contention on updating the last logon date in USER$, which you can override by setting _disable_last_successful_login_time=true in your initialization parms. All of these are removing unnecessary work that in an extreme login storm can cause a backup and timeouts.

I wish to underscore that all the above is for extreme cases, not for normal Oracle environments.

Upvotes: 1

MT0
MT0

Reputation: 168490

Within a PL/SQL block (or within your procedure), use DBMS_RANDOM to generate a random value and then DBMS_SESSION.SLEEP to sleep for that many seconds:

DECLARE
  seconds NUMBER := DBMS_RANDOM.VALUE(1, 20);
BEGIN
  DBMS_SESSION.SLEEP( seconds );
END;
/

fiddle

Upvotes: 4

Related Questions