Reputation: 7526
I am trying to set up an HttpClient through the HttpClientBuilder. I also had a look at the HttpClientConnectionManager, and this is where the confusion started.
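On the ConnectionManager, or more exactly the PoolingHttpClientConnectionManager, there are methods to close expired connections (closeExpiredConnections()) and to close idle connections (closeIdleConnections()).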
When is a connection considered expired?
When is it idle?
What happens when a connection from the pool is closed? Is it guaranteed that connections are recreated when needed?
Upvotes: 4
Views: 7647
Reputation: 703
According to: https://hc.apache.org/httpcomponents-client-4.5.x/current/tutorial/html/connmgmt.html#d5e418
HttpClient tries to mitigate the problem by testing whether the connection is 'stale', that is no longer valid because it was closed on the server side, prior to using the connection for executing an HTTP request. The stale connection check is not 100% reliable. The only feasible solution that does not involve a one thread per socket model for idle connections is a dedicated monitor thread used to evict connections that are considered expired due to a long period of inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() method to close all expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
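The difference between expired and idle is that an expired connection is one the pool no longer considers valid (its keep-alive period has elapsed, so it has typically been closed on the server side already), while an idle connection isn't necessarily closed on the server side but has simply not been used for a given period of time. When a connection is closed, it is evicted from the pool and its slot becomes available again; a new connection is opened on demand the next time one is needed.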
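To make the distinction concrete, here is a minimal toy model of a pool. It is not Apache HttpClient's implementation; ToyPool, Entry and all method names are made up for illustration, and time is passed in explicitly so the behaviour is easy to follow. It assumes the expiry deadline comes from a keep-alive duration set when the connection is released, while the idle threshold is chosen by whoever runs the eviction.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: hypothetical names, not Apache HttpClient's classes. */
final class ToyPool {
    /** A pooled connection reduced to the two timestamps the pool looks at. */
    static final class Entry {
        final int id;
        long lastUsedAt; // when the connection was last returned to the pool
        long expiresAt;  // deadline derived from the keep-alive duration
        Entry(int id) { this.id = id; }
    }

    private final Deque<Entry> available = new ArrayDeque<>();
    private int created = 0;

    /** Hands out a pooled connection, or creates a new one if none is usable. */
    Entry lease(long now) {
        Entry e;
        while ((e = available.pollFirst()) != null) {
            if (e.expiresAt > now) {
                return e; // still valid, reuse it
            }
            // expired: drop it and look at the next pooled entry
        }
        return new Entry(++created); // nothing reusable, so recreate on demand
    }

    /** Returns a connection to the pool and records its keep-alive deadline. */
    void release(Entry e, long now, long keepAliveMillis) {
        e.lastUsedAt = now;
        e.expiresAt = now + keepAliveMillis;
        available.addFirst(e);
    }

    /** Evicts entries whose keep-alive deadline has passed. */
    int closeExpired(long now) {
        int before = available.size();
        available.removeIf(e -> e.expiresAt <= now);
        return before - available.size();
    }

    /** Evicts entries unused for at least maxIdleMillis, expired or not. */
    int closeIdle(long maxIdleMillis, long now) {
        int before = available.size();
        available.removeIf(e -> now - e.lastUsedAt >= maxIdleMillis);
        return before - available.size();
    }

    int availableCount() { return available.size(); }
    int createdCount() { return created; }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool();
        Entry a = pool.lease(0);
        pool.release(a, 0, 5_000);
        System.out.println("closed as expired: " + pool.closeExpired(10_000));
        System.out.println("id of next leased connection: " + pool.lease(10_000).id);
    }
}
```

In this sketch, evicting a connection never leaves the caller without one: lease() simply creates a fresh entry when the pool has nothing usable, which is the behaviour the question asks about.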
Upvotes: 4
Reputation: 12849
HTTP is based on TCP, which ensures that packets are sent and received in the correct order and requests retransmission if packets get lost along the way. A TCP connection is started with a TCP handshake consisting of SYN, SYN-ACK and ACK messages, while it is ended with a FIN, ACK-FIN and ACK series, as can be seen from this image taken from Wikipedia.
While HTTP is a request-response protocol, opening and closing connections is quite costly, so HTTP/1.1 allowed existing connections to be reused. With the header Connection: keep-alive, for example, you tell your client (e.g. a browser) to keep the connection to a server open. A server can have literally thousands upon thousands of open connections at the same time. To avoid draining the server's resources, connections are usually time-limited: via socket timeouts, the server automatically closes idle connections, or connections with certain issues (broken internet access, ...), after some predefined time.
Plenty of HTTP implementations, such as Apache's HTTP client 4.4 and beyond, check the status of a connection only when they are about to use it.
The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms (Source)
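The rule described in that quote can be sketched as a single predicate. This is only an illustration of the described behaviour, not the library's code: StaleCheckRule and shouldValidate are hypothetical names, and treating the boundary as inclusive and a non-positive timeout as "never check" are assumptions.

```java
/** Sketch of the "validate after inactivity" rule; names are hypothetical. */
final class StaleCheckRule {
    /**
     * Decides whether a pooled connection should be tested for staleness
     * before reuse, based on how long it has been sitting unused.
     */
    static boolean shouldValidate(long lastUsedAt, long now, long validateAfterInactivityMillis) {
        // Assumption: a non-positive timeout disables the check entirely.
        if (validateAfterInactivityMillis <= 0) {
            return false;
        }
        return now - lastUsedAt >= validateAfterInactivityMillis;
    }

    public static void main(String[] args) {
        // With the 2000 ms default from the quote: used 500 ms ago vs. 5 s ago.
        System.out.println(shouldValidate(0, 500, 2_000));
        System.out.println(shouldValidate(0, 5_000, 2_000));
    }
}
```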
If a connection has not been used for some time, the client may therefore not have read the ACK-FIN from the server and may still think the connection is open when it was actually closed by the server some time ago. Such a connection is expired and is usually called half-closed. It may therefore be collected by the pool.
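The effect can be reproduced with plain java.net sockets on the loopback interface. The following is a small sketch, independent of any HTTP library; HalfClosedDemo and observeServerClose are hypothetical names. The "server" closes its side right after accepting, and the client only finds out once it actually reads from the socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Loopback demo: the client does not notice a server-side close until it reads. */
final class HalfClosedDemo {
    /**
     * Returns two observations from the client's point of view:
     * [0] is 1 if the socket still looks open before any read, else 0;
     * [1] is the value returned by read() afterwards.
     */
    static int[] observeServerClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            Socket serverSide = server.accept();
            serverSide.close(); // the server ends the connection from its side

            // The client-side socket object has no idea yet: these flags only
            // reflect what the client itself has done with the socket.
            boolean looksOpen = client.isConnected() && !client.isClosed();

            client.setSoTimeout(5_000); // do not hang forever if something goes wrong
            int read = client.getInputStream().read(); // end of stream shows up here
            return new int[] { looksOpen ? 1 : 0, read };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = observeServerClose();
        System.out.println("looked open before read: " + (result[0] == 1));
        System.out.println("read() returned: " + result[1]);
    }
}
```

This is why a pooled connection that has been sitting unused can look perfectly healthy to the client until the next request is attempted on it.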
Note that if you send requests including a Connection: close HTTP header, the connection should be closed right after the client has received the response.
The state of open connections can be checked via netstat, which should be present on most modern operating systems. I recently had to check one of our HTTP clients, which was managed through a third-party library that did not propagate the Connection: Close header properly and therefore led to plenty of half-closed connections.
Upvotes: 5