Nicklas Blomqvist

Reputation: 183

Timeout after 1024 connections from different virtual IPs

I have a problem I have struggled with for some time now without solving it.

I am implementing a traffic generator which has a client and a server side. The client side simulates devices with unique IP addresses. These IPs are added as virtual interfaces on the client, which the simulated devices then bind to. The devices then connect to the server side and generate traffic to/from it.

The problem is that I can only connect with 1023 devices; the following devices then get a timeout on connect. I have checked with Wireshark on the server side and I can see the SYN for the connection, but it is never received in the application. When I reuse IPs so that fewer than 1024 are used, I can make as many connections as I want.

I have created a Python program that is easier to run and reproduces the same problem:

client.py

import socket
from subprocess import call

TCP_IP = '192.168.169.218'
TCP_PORT = 9999
BUFFER_SIZE = 1024
MESSAGE = "-" * 1000

if __name__=='__main__':
    sockets = []
    for i in range(0, 10020):
        # Generate a unique address in 13.1.0.0/16 and add it as an alias on eth1
        ip = "13.1." + str(((i / 254) % 254) + 1) + "." + str((i % 254) + 1)
        cmd = "ip addr add " + ip + "/32 dev eth1"
        call(cmd, shell=True)
        # Bind the outgoing connection to the new source address
        s = socket.create_connection((TCP_IP, TCP_PORT), 10, (ip, 0))
        sockets.append(s)

    while 1:
        for s in sockets:
            s.send(MESSAGE)
            data = s.recv(BUFFER_SIZE)

    for s in sockets:
        s.close()

server.py

from socket import *
import thread

BUFF = 1024
HOST = '192.168.169.218'
PORT = 9999

def handler(clientsock,addr):
    while 1:
        data = clientsock.recv(BUFF)
        if not data: break
        clientsock.send(data)
        # type 'close' on client console to close connection from the server side
        if "close" == data.rstrip(): break     
    clientsock.close()
    print addr, "- closed connection" #log on console

if __name__=='__main__':
    count = 0   
    ADDR = (HOST, PORT)
    serversock = socket(AF_INET, SOCK_STREAM)
    serversock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    serversock.bind(ADDR)
    serversock.listen(5)
    while 1:
        print 'waiting for connection... listening on port', PORT
        clientsock, addr = serversock.accept()
        count += 1
        print count
        thread.start_new_thread(handler, (clientsock, addr))

I'm running CentOS 7.1 64-bit, and the Python version I have tested with is 2.7.5.

What I have done so far:
- Increased the open-files limit (nofile) to 1040000
- Increased net.core.somaxconn to 65535
- Increased net.ipv4.tcp_max_syn_backlog and net.core.netdev_max_backlog to 30000
- Increased core and TCP buffers
- Disabled firewalls and cleared all iptables rules
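For reference, the tunings listed above correspond to commands along these lines (this is my reconstruction from the description; the exact commands and persistence mechanism on the machine may differ):

```shell
# Raise the open-files limit for the current shell
# (add a matching nofile entry in /etc/security/limits.conf to persist)
ulimit -n 1040000

# Accept-queue and SYN/device backlog sizes
sysctl -w net.core.somaxconn=65535
sysctl -w net.ipv4.tcp_max_syn_backlog=30000
sysctl -w net.core.netdev_max_backlog=30000

# Disable the firewall and flush all iptables rules
systemctl stop firewalld
iptables -F
```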

I tried letting the Python client sleep one second after each connection, and then there was no problem, so my guess is that some flood protection kicks in. Does anyone have any ideas?

Upvotes: 1

Views: 163

Answers (1)

VenkatC

Reputation: 627

Interesting question, so I ran a test with my VMs. I found that you are hitting the limit on ARP neighbour entries:

# sysctl -a|grep net.ipv4.neigh.default.gc_thresh
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024

Those are the default values. When your 1024th connection fills up this table, the ARP garbage collector kicks in, which slows down neighbour resolution and causes your client to time out. (gc_thresh3 is the hard maximum number of neighbour entries; gc_thresh2 is the soft limit above which the kernel garbage-collects aggressively.)

I raised the values as below:

net.ipv4.neigh.default.gc_thresh1 = 16384
net.ipv4.neigh.default.gc_thresh2 = 32768
net.ipv4.neigh.default.gc_thresh3 = 65536

and voilà! No more 1024 limit. HTH
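To apply the thresholds immediately and make them survive a reboot, one common approach (assuming the standard /etc/sysctl.conf mechanism on CentOS 7) is:

```shell
# Persist the new neighbour-table thresholds
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.neigh.default.gc_thresh1 = 16384
net.ipv4.neigh.default.gc_thresh2 = 32768
net.ipv4.neigh.default.gc_thresh3 = 65536
EOF

# Reload sysctl settings so they take effect now
sysctl -p
```

Alternatively, `sysctl -w net.ipv4.neigh.default.gc_thresh3=65536` (and likewise for the other two) changes the running kernel without persisting.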

Upvotes: 1
