staticFlow

Reputation: 149

Why does using multiple Ethernet connections slow down the throughput of an I/O-bound task?

So I've got an interesting problem that seems counterintuitive to me. I am building a tool where the biggest bottleneck is the rate at which I can send packets. Currently I can handle over a million requests in less than 30 seconds, which is great, but I'm trying to squeeze out as much speed as possible. My idea was to attach a second Ethernet adapter to the machine and spin up two different net.Dialer values like so

net.Dialer{
        Timeout:   time.Duration(*timeoutPtr) * time.Second,
        LocalAddr: addr,
}

where addr is the local address of one of the two Ethernet adapters. Then I assign the dialers to jobs round-robin style like so:

for i, target := range targets {
    dialer := dialers[i%len(dialers)]
    ....
    go someNetworkFunction(dialer)
}

What's surprising to me is that when I run it with 2 adapters it executes much, much slower: 2 minutes instead of 30 seconds! I'm just trying to understand why giving the code two connections to send packets slows it down instead of speeding it up. It doesn't appear that the modulo operation there should cause a 300% slowdown. Is there something happening at the kernel layer when trying to use both adapters to send at the same time? Any help would be appreciated.

Upvotes: 3

Views: 79

Answers (1)

Norbert

Reputation: 6084

There can be multiple factors in play:

  1. Kernel or benchmark process in general.

If you profile your app, which at a million requests in 30 seconds spends relatively little time in application code, you will probably see that syscall accounts for most of the time. The syscall entries represent CPU time spent outside your application's own code, in the kernel on its behalf.

If this syscall time grows non-linearly relative to the workload you are benchmarking, you have a bottleneck outside of your program.

  2. Goroutines

Goroutines are multiplexed onto OS threads, which in turn run on CPU cores. While they are easy to create, switching between goroutines is not free of overhead. The implementation of someNetworkFunction can affect throughput: it may block on shared resources, or simply cause switching too often. You can try to manage this by telling the Go program to run Go code on fewer threads via GOMAXPROCS. By tweaking this value, you can determine the optimal setting for your program and hardware.

A more in-depth explanation of the scheduler can be found at https://rakyll.org/scheduler/

Upvotes: 2
