We run thousands of Python scripts on our RHEL machines that open and close connections to port 8088. As a result, that port receives a high volume of HTTP requests.
Here is a very simple example of one of the scripts:
import socket
import requests

def get_yarn_details(state='RUNNING'):
    # Build the REST URL, optionally filtered by application state
    state_suffix = '' if state == '' else "?states=" + state
    yarn_apps = "http://{0}:8088/ws/v1/cluster/apps" + state_suffix
    local_fqdn = socket.getfqdn(socket.gethostname())
    yarn_apps_url = yarn_apps.format(local_fqdn)
    try:
        response = requests.get(yarn_apps_url)
        response.raise_for_status()
        # Print the ID of every application in the response
        apps = response.json().get('apps', {}).get('app', [])
        for app in apps:
            print(app['id'])
    except requests.exceptions.RequestException as e:
        print(f"Error fetching YARN details: {e}")

# Example usage
get_yarn_details()
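For reference, here is a minimal sketch of the same lookup with two changes that may be worth trying: an explicit timeout, so a stalled request cannot hold a socket open indefinitely, and one requests.Session shared across calls, so a script that polls repeatedly can reuse a keep-alive connection instead of opening a new one each time. The names build_yarn_apps_url, get_yarn_app_ids and main, and the timeout values, are assumptions for illustration and are not part of the original scripts. Whether this lowers the CLOSE_WAIT count on the server side is not something this sketch proves.

```python
import socket
import requests


def build_yarn_apps_url(host, state="RUNNING"):
    """Build the same apps URL as the script above for a given host."""
    suffix = "" if state == "" else "?states=" + state
    return "http://{0}:8088/ws/v1/cluster/apps{1}".format(host, suffix)


def get_yarn_app_ids(session, host=None, state="RUNNING", timeout=(3.05, 10)):
    """Return application IDs using a caller-supplied session.

    timeout is (connect seconds, read seconds); the values are assumptions.
    """
    host = host or socket.getfqdn(socket.gethostname())
    response = session.get(build_yarn_apps_url(host, state), timeout=timeout)
    response.raise_for_status()
    # "or {}" guards against a null "apps" value (assumed possible when
    # no application matches the filter).
    apps = (response.json().get("apps") or {}).get("app", [])
    return [app["id"] for app in apps]


def main():
    # One session for the whole script; leaving the "with" block closes
    # the pooled connections from the client side.
    with requests.Session() as session:
        try:
            for app_id in get_yarn_app_ids(session):
                print(app_id)
        except requests.exceptions.RequestException as e:
            print(f"Error fetching YARN details: {e}")
```

Calling main() behaves like get_yarn_details() above. A script that polls several times can pass the same session to each get_yarn_app_ids call.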
The problem is that, from time to time, we see a large number of connections stuck in the CLOSE_WAIT state, and the high volume of HTTP requests seems to be the root cause. For example, when I count the CLOSE_WAIT connections on the Resource Manager machine, I get the following result:
netstat -tn | grep ':8088' | grep CLOSE_WAIT | wc -l
3945
As shown above, there are 3945 connections in CLOSE_WAIT. The list looks like this:
netstat -tn | grep ':8088' | grep CLOSE_WAIT
tcp 192 0 85.3.45.239:8088 85.3.45.240:61614 CLOSE_WAIT
tcp 187 0 85.3.45.239:8088 85.3.45.240:12594 CLOSE_WAIT
tcp 195 0 85.3.45.239:8088 85.3.45.239:25532 CLOSE_WAIT
...
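To see which clients the stuck connections come from, the netstat output can be grouped by remote host. This is a minimal sketch, assuming the six-column netstat -tn layout shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state). The function name count_close_wait_by_peer is hypothetical.

```python
from collections import Counter


def count_close_wait_by_peer(netstat_output, local_port=8088):
    """Count CLOSE_WAIT connections on local_port, grouped by remote host."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, short lines, and connections in any other state.
        if len(fields) < 6 or fields[5] != "CLOSE_WAIT":
            continue
        local_addr, foreign_addr = fields[3], fields[4]
        if not local_addr.endswith(":" + str(local_port)):
            continue
        # Drop the ephemeral port and keep only the remote host.
        peer_host = foreign_addr.rsplit(":", 1)[0]
        counts[peer_host] += 1
    return counts


sample = """\
tcp 192 0 85.3.45.239:8088 85.3.45.240:61614 CLOSE_WAIT
tcp 187 0 85.3.45.239:8088 85.3.45.240:12594 CLOSE_WAIT
tcp 195 0 85.3.45.239:8088 85.3.45.239:25532 CLOSE_WAIT
"""
for host, count in count_close_wait_by_peer(sample).most_common():
    print(host, count)
```

For the three lines listed above, this prints 85.3.45.240 with a count of 2 and 85.3.45.239 with a count of 1. In practice the input would be the full output of netstat -tn captured on the Resource Manager machine.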
This is a serious problem for us. What can we do to avoid the CLOSE_WAIT connections that build up from this huge number of HTTP requests?