Reputation: 59
I want to develop software in Go that handles requests from multiple TCP connections and runs on a server with a 10 Gb NIC.
A single core does not seem to be fast enough to recv/send all the data, so I want the software to recv/send data across multiple CPU cores.
I made a simple test server to check whether Go can recv/send data on multiple CPU cores. It launches 16 goroutines, each calling Serve on the same listener through an http.Server, and I use ab (ApacheBench) as the client.
After the server starts, only one thread invokes EpollWait, even though the server has launched 18 threads; and when I run ab with a concurrency of 16, the server still occupies only one core.
So the question: is there any way in Go to have multiple threads handle recv/send for multiple TCP connections, or do I have to call syscall.EpollWait myself and build my own network framework?
The server's test code:
package main

import (
	"io"
	"log"
	"net"
	"net/http"
	"runtime"
)

type HandlerFunction struct{}

func (self HandlerFunction) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	data := "Hello"
	//fmt.Printf("data_len=%d\n", len(data))
	io.WriteString(w, data)
}

func RoutineFunction(hs *http.Server, l net.Listener) {
	runtime.LockOSThread()
	err := hs.Serve(l)
	if err != nil {
		log.Fatalf("serve fail, err=[%s]", err)
	}
}

func main() {
	runtime.GOMAXPROCS(16)

	l, err := net.Listen("tcp", "0.0.0.0:12345")
	if err != nil {
		log.Fatalf("listen fail, err=[%s]", err)
	}

	// 15 goroutines serving on the shared listener, plus one more in main: 16 total.
	for i := 0; i < 15; i++ {
		hs := http.Server{}
		hs.Handler = HandlerFunction{}
		go RoutineFunction(&hs, l)
	}

	hs := http.Server{}
	hs.Handler = HandlerFunction{}
	RoutineFunction(&hs, l)
}
Upvotes: 2
Views: 4678
Reputation: 109443
Not exactly.
The Go runtime (as of go1.5) uses a single network poller. When there is actual work to be done in the server, this is rarely the bottleneck, and the threads running goroutines will be kept busy. In some cases though, either with enough cores or enough throughput, the Go runtime will start to suffer, especially since the poller will often be on a different NUMA node than the thread doing the IO.
If you need to run at that scale, I currently suggest limiting the Go server to a single NUMA node and running multiple instances of the server.
The exception to this is that if you put the socket into blocking mode, IO on that socket will be bound to a single OS thread. I haven't done any throughput tests with this method to see if there's any benefit, but if you're using relatively few sockets concurrently, it couldn't hurt to try.
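The answer doesn't say how multiple instances should share the listening port; one common way on Linux is SO_REUSEPORT, so the following is only a sketch under that assumption (the 0xf option value is Linux's SO_REUSEPORT, hard-coded because the stock syscall package may not export it; NUMA pinning of each process, e.g. via numactl, is left to the operator):

// Minimal sketch, assuming Linux with SO_REUSEPORT support: each server
// process builds its own listener on the same port, and the kernel spreads
// incoming connections across the processes.
package main

import (
	"io"
	"log"
	"net"
	"net/http"
	"os"
	"syscall"
)

// soReusePort is the Linux SO_REUSEPORT option value; hard-coded because the
// syscall package may not define the constant.
const soReusePort = 0xf

func reusePortListener(port int) (net.Listener, error) {
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, syscall.IPPROTO_TCP)
	if err != nil {
		return nil, err
	}
	if err := syscall.SetsockoptInt(fd, syscall.SOL_SOCKET, soReusePort, 1); err != nil {
		syscall.Close(fd)
		return nil, err
	}
	if err := syscall.Bind(fd, &syscall.SockaddrInet4{Port: port}); err != nil {
		syscall.Close(fd)
		return nil, err
	}
	if err := syscall.Listen(fd, syscall.SOMAXCONN); err != nil {
		syscall.Close(fd)
		return nil, err
	}
	f := os.NewFile(uintptr(fd), "reuseport-listener")
	defer f.Close() // net.FileListener duplicates the descriptor
	return net.FileListener(f)
}

func main() {
	l, err := reusePortListener(12345)
	if err != nil {
		log.Fatalf("listen fail, err=[%s]", err)
	}
	http.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		io.WriteString(w, "Hello")
	})
	log.Fatal(http.Serve(l, nil))
}

Starting one such process per NUMA node gives the kernel, rather than a single in-process poller, the job of distributing connections.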
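To illustrate the blocking-mode idea, here is a rough sketch assuming the Go ~1.5 behaviour where (*net.TCPConn).File() puts the duplicated descriptor into blocking mode; the echo handler and port are purely illustrative, not something from the answer:

// Rough sketch: reads and writes on the *os.File returned by File() go through
// plain blocking syscalls, so each connection occupies an OS thread directly
// instead of going through the runtime's shared network poller.
package main

import (
	"log"
	"net"
)

func handleBlocking(c *net.TCPConn) {
	f, err := c.File() // duplicates the fd; on Go ~1.5 this also switches it to blocking mode
	if err != nil {
		log.Printf("File fail, err=[%s]", err)
		c.Close()
		return
	}
	c.Close() // the duplicate is independent of the original conn
	defer f.Close()

	buf := make([]byte, 32*1024)
	for {
		n, err := f.Read(buf) // blocking read: ties up this goroutine's OS thread
		if err != nil {
			return
		}
		if _, err := f.Write(buf[:n]); err != nil {
			return
		}
	}
}

func main() {
	l, err := net.ListenTCP("tcp", &net.TCPAddr{Port: 12346})
	if err != nil {
		log.Fatalf("listen fail, err=[%s]", err)
	}
	for {
		c, err := l.AcceptTCP()
		if err != nil {
			log.Fatalf("accept fail, err=[%s]", err)
		}
		go handleBlocking(c)
	}
}

Because every connection holds a thread for the duration of each blocking call, this only makes sense with relatively few concurrent sockets, as noted above.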
Upvotes: 6