Hisnessness

Reputation: 243

Can't open more than 1023 sockets

I'm developing some code that is simulating network equipment. I need to run several thousand simulated "agents", and each needs to connect to a service. The problem is that after opening 1023 connections, the connects start to time out, and the whole thing comes crashing down.

The main code is in Go, but I've written a very trivial python script which reproduces the problem.

The one thing that is somewhat unusual is that we need to set the local address on the socket when we create it. This is because the equipment that the agents are connecting to expects the apparent IP to match what we say it should be. To achieve this, I have configured 10,000 virtual interfaces (eth0:1 to eth0:10000). These are assigned unique IP addresses in a private network.
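
For context, the aliases are set up with something along these lines (a sketch only; the interface naming, netmask, and address plan shown here are illustrative rather than my exact configuration):

import subprocess

# Sketch: create aliased interfaces eth0:1, eth0:2, ... each with its own IP.
# The netmask and address plan here are illustrative.
n = 0
for b in range(10, 30):
    for d in range(1, 100):
        n += 1
        ip = "1.%d.1.%d" % (b, d)
        subprocess.check_call(
            ["ifconfig", "eth0:%d" % n, ip, "netmask", "255.0.0.0", "up"])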

The python script is just this (it only attempts about 2000 connects):

import socket

i = 0
for b in range(10, 30):
    for d in range(1, 100):
        i += 1
        ip = "1.%d.1.%d" % (b, d)
        print("Conn %i   %s" % (i, ip))
        s = socket.create_connection(("1.6.1.1", 5060), 10, (ip, 5060))

If I remove the last argument to socket.create_connection (the source address), then I can get all 2000 connections.

The difference when using a local address is that a bind must be made before the connection can be set up, so the output from this program running under strace looks like this:

Conn 1023   1.20.1.33
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(5060), sin_addr=inet_addr("1.20.1.33")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(5060), sin_addr=inet_addr("1.6.1.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

If I run without a local address, the AF_INET bind goes away, and it works.
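
For what it's worth, passing a source address to socket.create_connection is essentially equivalent to doing the bind yourself before connecting, which is where that extra AF_INET bind comes from. A simplified sketch (error handling omitted):

import socket

# Simplified equivalent of create_connection(addr, timeout, source_address)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(10)
s.bind(("1.20.1.33", 5060))     # the extra bind that shows up under strace
s.connect(("1.6.1.1", 5060))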

So, it seems there must be some kind of limit on the number of binds that can be made. I've waded through all sorts of links about TCP tuning on Linux, tried messing with tcp_tw_reuse/tcp_tw_recycle, reduced tcp_fin_timeout, and tried other things I can't remember.

This is running on Ubuntu Linux 11.04, kernel 2.6.38, 64-bit. It's a virtual machine on a VMware ESX cluster.

Just before posting this, I tried running a second instance of the python script, with its addresses starting at 1.30.1.1. The first script plowed through to 1023 connections, but the second one couldn't even complete its first connect, which suggests the problem is related to the large number of virtual interfaces. Could some internal data structure be limited? Some max memory setting somewhere?

Can anyone think of some limit in Linux that would cause this?

Update:

This morning I decided to try an experiment. I modified the python script to use the "main" interface IP as the source IP, with explicitly chosen source ports in the 10000+ range. The script now looks like this:

import socket

for i in range(1, 2000):
    print("Conn %i" % i)
    s = socket.create_connection(("1.6.1.1", 5060), 10, ("1.1.1.30", i + 10000))

This script works just fine, so this adds to my belief that the problem is related to the large number of aliased IP addresses.

Upvotes: 4

Views: 4115

Answers (3)

Hisnessness

Reputation: 243

What a DOH moment. I was watching the server with netstat, and since I didn't see a large number of connections I didn't think there was a problem. But finally I wised up and checked the kernel log (/var/log/kern.log), in which I found this:

Mar  8 11:03:52 TestServer01 kernel: ipv4: Neighbour table overflow.

This led me to this posting: http://www.serveradminblog.com/2011/02/neighbour-table-overflow-sysctl-conf-tunning/ which explains how to increase the limit. Bumping the gc_thresh3 value immediately solved the problem.
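
For anyone else who hits this: the settings in question are net.ipv4.neigh.default.gc_thresh1/2/3, where gc_thresh3 is the hard limit on the neighbour (ARP) table; its default of 1024 lines up suspiciously well with the 1023-connection ceiling. A sketch of raising the limits at runtime (the values below are only examples; for a permanent change put the equivalent lines in /etc/sysctl.conf as the linked post describes):

# Sketch: raise the IPv4 neighbour table limits by writing to /proc/sys.
# Example values only; size them for the number of local addresses you need.
# Requires root.
for name, value in [("gc_thresh1", "4096"),
                    ("gc_thresh2", "8192"),
                    ("gc_thresh3", "16384")]:
    with open("/proc/sys/net/ipv4/neigh/default/" + name, "w") as f:
        f.write(value)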

Upvotes: 2

Anya Shenanigans

Reputation: 94654

Are you absolutely certain that the issue is not on the server side, with the server failing to close its sockets? i.e. what does lsof -n -p of the server process show? What does plimit -p of the server process show? The server side could be tied up, unable to accept any more connections, while the client side gets the EINPROGRESS result.

Check the ulimit for the number of open files on both sides of the connection - the usual default of 1024 is too close to your 1023 ceiling to be a coincidence.
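
A quick way to check it from the client process itself (a sketch; ulimit -n in the shell shows the same soft limit):

import resource

# Print the soft and hard limits on open file descriptors for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))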

Upvotes: 0

Matt Joyce

Reputation: 2010

You may want to look at the sysctl settings under net.ipv4.

These include things like the connection-tracking maximum and other relevant settings you may wish to tweak.
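
For example, you can peek at a few of the likely suspects directly under /proc/sys before changing anything (a sketch; the conntrack entry only exists if connection tracking is loaded, and exact key names vary by kernel):

# Sketch: print a few potentially relevant kernel settings from /proc/sys.
paths = ["/proc/sys/net/ipv4/ip_local_port_range",
         "/proc/sys/net/ipv4/tcp_fin_timeout",
         "/proc/sys/net/netfilter/nf_conntrack_max"]
for path in paths:
    try:
        with open(path) as f:
            print("%s = %s" % (path, f.read().strip()))
    except IOError:
        print("%s not available on this kernel" % path)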

Upvotes: 0
