Jijo Mathew
Jijo Mathew

Reputation: 352

Postgres query run executed using hibernate getting dropped if the query takes a long time without any timeout exception

I am running a postgres query that takes more than two hours. This query is executed using hibernate in a java program. After about 1.5 hours the query stops showing up in the server status in pg_admin.

Since, the query disappeared from the list of active queries on the database, I am expecting a success or a timeout exception. But, I get none.(No exception) and my thread in stuck in the wait state. I know the query has not finished because it was supposed to do some inserts in a table and I cannot find the expected rows in the table.

I am using pgbouncer for the connection pooling and the query_timeout is disabled. Had it been a hibernate timeout I should have got an exception. OS parameters on the DB machine and Client machine(Machine running java program)

Upvotes: 0

Views: 823

Answers (2)

Robins Tharakan
Robins Tharakan

Reputation: 2473

I am sure you would have already looked at the following resources:

  1. PostgreSQL Timeout Docs
  2. PgBouncer timeout (you already mention).
  3. Hibernate timeout parameters, if any.

Once that is done, (just like triaging permission issues during a new installation, ) I recommend that you try the following SQL, from different scenarios (given below) and ascertain what is actually causing this timeout:

SELECT pg_sleep(7200);
  1. Login to the server (via psql) and see whether this SQL times-out.
  2. Login to the PgBouncer (again via psql) and see whether PgBouncer times out.
  3. Execute this SQL via Hibernate (via PgBouncer), and see whether there is a timeout.

This should allow you to clearly isolate the cause for this.

Upvotes: 2

Jijo Mathew
Jijo Mathew

Reputation: 352

I found that the issue was caused due to the TCP connection getting dropped and the client still hanging waiting for the response.

I altered the following parameters at OS level:-

  • /proc/sys/net/ipv4/tcp_keepalive_time = 2700

Default value was 7200. This causes a keep alive check at every 2700 seconds instead of 7200 seconds.

Upvotes: 2

Related Questions