Ruifeng Ma

Reputation: 2787

Understanding the lifecycle of a connection managed by PoolingHttpClientConnectionManager in Apache HTTP client

I was left confused after reading the Connection Management doc of the Apache HttpComponents module, along with a few other resources on the connection keep-alive strategy and connection eviction policy.

A number of adjectives are used there to describe the state of a connection, such as stale, idle, available, expired and closed, but there is no lifecycle diagram showing how a connection moves between these states.

My confusion mainly arose from the situation below.

I set a ConnectionKeepAliveStrategy that provides a keep-alive duration of 5 seconds via the code snippet below.

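        // Honour the server's "Keep-Alive: timeout=N" value (seconds) if present,
        // otherwise fall back to 5 seconds. The strategy returns milliseconds.
        ConnectionKeepAliveStrategy keepAliveStrategy = ( httpResponse, httpContext ) -> {
            HeaderElementIterator iterator = 
                 new BasicHeaderElementIterator( httpResponse.headerIterator( HTTP.CONN_KEEP_ALIVE ) );
            while ( iterator.hasNext() )
            {
                HeaderElement header = iterator.nextElement();
                if ( header.getValue() != null && header.getName().equalsIgnoreCase( "timeout" ) )
                {
                    // Note: a non-numeric timeout value would throw NumberFormatException here.
                    return Long.parseLong( header.getValue(), 10) * 1000;
                }
            }
            return 5 * 1000;
        };
        // Note: the pool limits set on the builder are likely ignored once a connection
        // manager is supplied explicitly; if so, configure them on this.cm instead.
        this.client = HttpAsyncClients.custom()
                .setDefaultRequestConfig( requestConfig )
                .setMaxConnTotal( 500 )    
                .setMaxConnPerRoute( 500 )
                .setConnectionManager( this.cm )  
                .setKeepAliveStrategy( keepAliveStrategy )
                .build();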
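
For readers who want to see what the lambda above computes without pulling in HttpClient, here is a minimal plain-Java sketch. The class name KeepAliveHeader and the method timeoutMillis are made up for illustration, and the parsing is deliberately simpler than the library's BasicHeaderElementIterator. It takes a Keep-Alive header value such as "timeout=15, max=100" and returns milliseconds, falling back to a default when no usable timeout parameter is present (unlike the snippet above, a malformed number also falls back instead of throwing).

```java
// Plain-Java sketch, NOT the HttpClient API: the names here are hypothetical.
final class KeepAliveHeader {
    private KeepAliveHeader() {}

    /**
     * Returns the keep-alive duration in milliseconds taken from a header value
     * such as "timeout=15, max=100", or defaultMillis when no usable
     * timeout parameter is present.
     */
    static long timeoutMillis(String headerValue, long defaultMillis) {
        if (headerValue == null) {
            return defaultMillis;
        }
        for (String element : headerValue.split(",")) {
            // Each element looks like "name=value"; split on the first '=' only.
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds; the strategy works in milliseconds.
                    return Long.parseLong(pair[1].trim()) * 1000L;
                } catch (NumberFormatException ignored) {
                    return defaultMillis;
                }
            }
        }
        return defaultMillis;
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis("timeout=15, max=100", 5_000L));
        System.out.println(timeoutMillis("max=100", 5_000L));
        System.out.println(timeoutMillis(null, 5_000L));
    }
}
```

With a 5000 ms default, a response without a timeout parameter yields the same 5 second keep-alive duration described in the question.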

The server I am talking to does support keeping connections alive. When I printed out the pool stats of the connection manager after executing around 200 requests asynchronously in a single batch, the following was observed.

Total Stats:
-----------------
Available: 139
Leased: 0
Max: 500
Pending: 0

After waiting for 30 seconds (by which time the keep-alive timeout had long been exceeded), I started a new batch of the same HTTP calls. Upon inspecting the connection manager pool stats, the number of available connections was still 139.

Shouldn't it be zero, since the keep-alive timeout had been reached? The PoolStats Javadoc states that Available is "the number of idle persistent connections". Are idle persistent connections considered alive?

I think the question "Apache HttpClient: How to auto close connections by server's keep-alive time" is a close match, but I hope some expert can give an insightful explanation of the lifecycle of a connection managed by PoolingHttpClientConnectionManager.

Some other general questions:

  1. Does the default connection manager used in HttpAsyncClients.createDefault() handle the connection keep-alive strategy and connection eviction on its own?
  2. What requirements or limitations would call for implementing them on a custom basis? Will they contradict each other?

Upvotes: 3

Views: 4316

Answers (1)

Ruifeng Ma

Reputation: 2787

Documenting some of my further findings, which might partially serve as an answer.

  1. Whether or not a ConnectionKeepAliveStrategy is used to set a timeout on the keep-alive session, the connections end up in the TCP ESTABLISHED state, as inspected via netstat -apt. I observed that they are automatically recycled after around 5 minutes in my Linux test environment.

  2. When NOT using a ConnectionKeepAliveStrategy, upon a second request batch the established connections will be reused.

  3. When using a ConnectionKeepAliveStrategy and its timeout has NOT been reached, upon a second request batch the established connections will be reused.

  4. When using a ConnectionKeepAliveStrategy and its timeout has been exceeded, upon a second request batch the established connections will be recycled into the TIME_WAIT state, indicating that the client side has decided to close the connections.

  5. This recycling can be triggered actively by calling connectionManager.closeExpiredConnections() in a separate connection-evicting thread, which will move the connections into the TIME_WAIT state.
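
The evicting thread from point 5 could look roughly like the sketch below. It is only an outline under assumptions: ConnectionEvictor is a made-up helper name, and the pool is passed in as a plain Runnable so the block compiles without HttpClient on the classpath. In real code the task would presumably be something like connectionManager::closeExpiredConnections.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HttpClient: periodically runs an eviction task.
final class ConnectionEvictor implements AutoCloseable {
    private final ScheduledExecutorService scheduler;
    private final Runnable evictionTask;

    /**
     * @param evictionTask what to run on each pass, e.g. a method reference to
     *                     the connection manager's closeExpiredConnections()
     * @param periodMillis delay before the first pass and between later passes
     */
    ConnectionEvictor(Runnable evictionTask, long periodMillis) {
        this.evictionTask = evictionTask;
        // A daemon thread, so the evictor never keeps the JVM alive on its own.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "connection-evictor");
            t.setDaemon(true);
            return t;
        });
        this.scheduler.scheduleAtFixedRate(
                this::runOnce, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Runs a single eviction pass; also called by the scheduler. */
    void runOnce() {
        try {
            evictionTask.run();
        } catch (RuntimeException e) {
            // Swallowed on purpose: an exception escaping a scheduled task
            // would cancel all further runs.
        }
    }

    boolean isClosed() {
        return scheduler.isShutdown();
    }

    /** Stops the background thread. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The period is a trade-off: expired connections stay ESTABLISHED for at most one period after their keep-alive timeout, so a value close to the timeout itself (5 seconds here) seems like a reasonable starting point.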

I think the general observation is that ESTABLISHED connections are counted as Available in the connection pool stats. The connection keep-alive strategy with a timeout does mark the connections as expired, but the expiry only takes effect when new requests are processed or when we explicitly instruct the connection manager to close expired connections.
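
To make that observation concrete, here is a toy model of a pool that checks expiry lazily. It is my own simplification, not HttpClient's implementation, and every name in it (LazyExpiryPool, release, lease, closeExpired) is hypothetical. Time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only; not how HttpClient is actually implemented.
final class LazyExpiryPool {
    /** An idle connection that remembers when its keep-alive window ends. */
    private static final class Entry {
        final long expiresAtMillis;
        Entry(long expiresAtMillis) { this.expiresAtMillis = expiresAtMillis; }
    }

    private final Deque<Entry> available = new ArrayDeque<>();
    private final long keepAliveMillis;
    private int closed;

    LazyExpiryPool(long keepAliveMillis) { this.keepAliveMillis = keepAliveMillis; }

    /** Returns a connection to the pool and stamps its expiry time. */
    void release(long nowMillis) {
        available.push(new Entry(nowMillis + keepAliveMillis));
    }

    /** Counts idle entries WITHOUT looking at expiry, like the Available stat. */
    int availableCount() { return available.size(); }

    int closedCount() { return closed; }

    /**
     * Leases a connection. Expired entries are only discovered and closed here.
     * Returns true if a live entry was reused, false if the caller must open a new one.
     */
    boolean lease(long nowMillis) {
        while (!available.isEmpty()) {
            Entry e = available.pop();
            if (e.expiresAtMillis > nowMillis) {
                return true;
            }
            closed++; // expired: close it and keep looking
        }
        return false;
    }

    /** Explicit eviction, the analogue of closeExpiredConnections(). */
    int closeExpired(long nowMillis) {
        int before = available.size();
        available.removeIf(e -> e.expiresAtMillis <= nowMillis);
        int evicted = before - available.size();
        closed += evicted;
        return evicted;
    }

    public static void main(String[] args) {
        LazyExpiryPool pool = new LazyExpiryPool(5_000L);
        for (int i = 0; i < 3; i++) pool.release(0L);
        // 30 s later nothing has touched the pool, so all entries still count as available.
        System.out.println("available after 30 s idle: " + pool.availableCount());
        System.out.println("evicted explicitly: " + pool.closeExpired(30_000L));
        System.out.println("available after eviction: " + pool.availableCount());
    }
}
```

In this model the available count only drops when something leases from the pool or calls closeExpired. If a second batch closes the expired entries, opens fresh connections and releases them again, the count can end up at the same number as before, which would be consistent with the 139 seen in the question.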

TCP state diagram from Wikipedia for reference.

Upvotes: 5
