Designing a High-Performance, Multi-Goroutine Socket Server in Go
Leeting Yan
High-performance networking is one of Go’s strengths. With lightweight goroutines, a rich net package, and strong concurrency primitives, Go is a great fit for building custom TCP servers, game backends, proxies, and internal protocols.
In this article, we’ll design and implement a high-performance, multi-goroutine socket server in Go, with an architecture you can evolve into a real-world production system.
We’ll cover:
- Architecture and design goals
- A baseline TCP server
- A multi-goroutine concurrency model
- Connection limits and backpressure
- Request handling with worker pools
- Graceful shutdown and observability
1. Goals and Design Principles
We’ll design a server that:
- Accepts many concurrent TCP connections.
- Handles each connection in dedicated goroutines.
- Uses a worker pool for heavier request processing.
- Implements backpressure and connection limits.
- Supports graceful shutdown (finish in-flight work).
- Is easy to extend with custom protocols.
We’ll assume a simple binary protocol:
- Each message is length-prefixed with a 4-byte big-endian uint32.
- After the length comes that many bytes of payload.
- The server echoes the payload back with the same framing.
- In a real system, the payload would be JSON, protobuf, etc.
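For example, a frame carrying the 4-byte payload "ping" is 8 bytes on the wire:
00 00 00 04 70 69 6E 67   // length = 4 (big-endian), then "ping" in ASCII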
2. Project Layout
A simple layout:
socket-server/
go.mod
cmd/server/main.go
internal/server/config.go
internal/server/tcp_server.go
internal/server/worker_pool.go
internal/protocol/frame.go
For brevity, we’ll inline most code here, but you can easily split it into the files and packages above in your own project.
3. Configuration and Constants
Start with some configuration defaults.
package server
import "time"
type Config struct {
ListenAddr string // e.g. ":9000"
MaxConnections int // maximum concurrent clients
ReadTimeout time.Duration // per read
WriteTimeout time.Duration // per write
IdleTimeout time.Duration // overall idle
WorkerPoolSize int // number of workers
MaxMessageSize int // limit message size
Backlog int // optional: backlog for channels
}
func DefaultConfig() Config {
return Config{
ListenAddr: ":9000",
MaxConnections: 10000,
ReadTimeout: 10 * time.Second,
WriteTimeout: 10 * time.Second,
IdleTimeout: 60 * time.Second,
WorkerPoolSize: 64,
MaxMessageSize: 1 << 20, // 1MB
Backlog: 1024,
}
}
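For example, a caller can start from the defaults and override only what it needs (the values below are purely illustrative; the field names come from the Config above):

cfg := server.DefaultConfig()
cfg.ListenAddr = ":7000"       // custom port for this deployment
cfg.WorkerPoolSize = 128       // more workers for CPU-heavy handlers
cfg.MaxMessageSize = 256 << 10 // cap frames at 256 KB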
4. Protocol Framing: Length-Prefixed Messages
Let’s build simple framing helpers so reads and writes on a net.Conn stay clean.
package protocol
import (
"bufio"
"encoding/binary"
"fmt"
"io"
)
const headerSize = 4 // 4 bytes length prefix
// ReadFrame reads a single length-prefixed message from r.
func ReadFrame(r *bufio.Reader, maxSize int) ([]byte, error) {
header := make([]byte, headerSize)
if _, err := io.ReadFull(r, header); err != nil {
return nil, err
}
length := binary.BigEndian.Uint32(header)
if length == 0 {
return nil, fmt.Errorf("empty frame")
}
if int(length) > maxSize {
return nil, fmt.Errorf("frame too large: %d > %d", length, maxSize)
}
payload := make([]byte, length)
if _, err := io.ReadFull(r, payload); err != nil {
return nil, err
}
return payload, nil
}
// WriteFrame writes a length-prefixed message to w.
func WriteFrame(w io.Writer, payload []byte) error {
header := make([]byte, headerSize)
binary.BigEndian.PutUint32(header, uint32(len(payload)))
if _, err := w.Write(header); err != nil {
return err
}
_, err := w.Write(payload)
return err
}
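As a quick sanity check, the two helpers round-trip through an in-memory buffer. A minimal test sketch (the file name internal/protocol/frame_test.go is just a suggestion):

package protocol

import (
	"bufio"
	"bytes"
	"testing"
)

func TestFrameRoundTrip(t *testing.T) {
	var buf bytes.Buffer
	if err := WriteFrame(&buf, []byte("hello")); err != nil {
		t.Fatal(err)
	}
	got, err := ReadFrame(bufio.NewReader(&buf), 1<<20)
	if err != nil {
		t.Fatal(err)
	}
	if string(got) != "hello" {
		t.Fatalf("round trip mismatch: got %q, want %q", got, "hello")
	}
}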
5. Worker Pool for Request Processing
We don’t want each connection goroutine to do heavy CPU or I/O work itself. Instead, we’ll dispatch tasks to a shared worker pool.
package server
import (
"context"
"strconv"
)
type Task struct {
ConnID uint64
Payload []byte
ResultC chan []byte
ErrC chan error
}
type WorkerPool struct {
cfg Config
tasks chan Task
cancel context.CancelFunc
}
func NewWorkerPool(ctx context.Context, cfg Config) *WorkerPool {
ctx, cancel := context.WithCancel(ctx)
wp := &WorkerPool{
cfg: cfg,
tasks: make(chan Task, cfg.Backlog),
cancel: cancel,
}
for i := 0; i < cfg.WorkerPoolSize; i++ {
go wp.workerLoop(ctx, i)
}
return wp
}
// Submit blocks until a worker can accept the task, which provides natural backpressure.
func (wp *WorkerPool) Submit(task Task) {
wp.tasks <- task
}
// Close signals all workers to exit; tasks still queued may never be processed.
func (wp *WorkerPool) Close() {
wp.cancel()
}
func (wp *WorkerPool) workerLoop(ctx context.Context, workerID int) {
for {
select {
case <-ctx.Done():
return
case task := <-wp.tasks:
// In a real server: parse payload, route, call business logic, DB, etc.
// Here we "echo" with a small modification to show processing.
processed := append([]byte{}, task.Payload...)
// Example: prefix with the worker ID (strconv formats IDs >= 10 correctly)
result := append([]byte("worker-"+strconv.Itoa(workerID)+":"), processed...)
select {
case task.ResultC <- result:
case <-ctx.Done():
task.ErrC <- ctx.Err()
}
}
}
}
This is intentionally simple: each Task includes a response channel so the connection goroutine can resume once processing is done.
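A minimal usage sketch of the pool from inside the server package (the output shown is illustrative; fmt would need to be imported):

// demoWorkerPool is a tiny illustration, not production code.
func demoWorkerPool() {
	wp := NewWorkerPool(context.Background(), DefaultConfig())
	defer wp.Close()

	resultC := make(chan []byte, 1)
	errC := make(chan error, 1)
	wp.Submit(Task{ConnID: 1, Payload: []byte("hello"), ResultC: resultC, ErrC: errC})

	select {
	case res := <-resultC:
		fmt.Printf("processed: %s\n", res) // e.g. "worker-3:hello"
	case err := <-errC:
		fmt.Println("worker error:", err)
	}
}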
6. Connection Handling: One Goroutine per Connection
Now let’s implement the TCP server: accept connections, limit them, and handle each in a goroutine.
package server
import (
"bufio"
"context"
"fmt"
"log"
"net"
"sync"
"sync/atomic"
"time"
"example.com/socket-server/internal/protocol"
)
type TCPServer struct {
cfg Config
listener net.Listener
activeConn int64 // atomic
nextID uint64
wp *WorkerPool
mu sync.Mutex
closing bool
wg sync.WaitGroup
}
func NewTCPServer(cfg Config) *TCPServer {
return &TCPServer{
cfg: cfg,
}
}
func (s *TCPServer) Serve(ctx context.Context) error {
// Setup listener with optional advanced options via net.ListenConfig
ln, err := net.Listen("tcp", s.cfg.ListenAddr)
if err != nil {
return err
}
s.listener = ln
log.Printf("Listening on %s", s.cfg.ListenAddr)
// Start worker pool
s.wp = NewWorkerPool(ctx, s.cfg)
// Accept loop
for {
conn, err := ln.Accept()
if err != nil {
if s.isClosing() {
return nil
}
log.Printf("Accept error: %v", err)
continue
}
// Connection limit
if atomic.LoadInt64(&s.activeConn) >= int64(s.cfg.MaxConnections) {
log.Printf("Too many connections, rejecting new client")
conn.Close()
continue
}
atomic.AddInt64(&s.activeConn, 1)
s.wg.Add(1)
connID := atomic.AddUint64(&s.nextID, 1)
go s.handleConn(ctx, connID, conn)
}
}
func (s *TCPServer) handleConn(ctx context.Context, connID uint64, conn net.Conn) {
defer func() {
conn.Close()
atomic.AddInt64(&s.activeConn, -1)
s.wg.Done()
log.Printf("Conn %d closed", connID)
}()
log.Printf("Conn %d accepted from %s", connID, conn.RemoteAddr())
// Buffers
reader := bufio.NewReader(conn)
// Initial overall deadline; the per-read/per-write deadlines set below override it once traffic starts
if s.cfg.IdleTimeout > 0 {
_ = conn.SetDeadline(time.Now().Add(s.cfg.IdleTimeout))
}
for {
// Reset read deadline
if s.cfg.ReadTimeout > 0 {
_ = conn.SetReadDeadline(time.Now().Add(s.cfg.ReadTimeout))
}
payload, err := protocol.ReadFrame(reader, s.cfg.MaxMessageSize)
if err != nil {
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
log.Printf("Conn %d read timeout", connID)
} else {
log.Printf("Conn %d read error: %v", connID, err)
}
return
}
// Prepare task
resultC := make(chan []byte, 1)
errC := make(chan error, 1)
task := Task{
ConnID: connID,
Payload: payload,
ResultC: resultC,
ErrC: errC,
}
// Submit to worker pool (this may block if backlog is full – backpressure)
select {
case <-ctx.Done():
return
case s.wp.tasks <- task: // send on the channel directly so we can select on ctx alongside it
}
// Wait for worker response or context cancellation
select {
case <-ctx.Done():
return
case err := <-errC:
if err != nil {
log.Printf("Conn %d worker error: %v", connID, err)
return
}
case result := <-resultC:
if s.cfg.WriteTimeout > 0 {
_ = conn.SetWriteDeadline(time.Now().Add(s.cfg.WriteTimeout))
}
if err := protocol.WriteFrame(conn, result); err != nil {
log.Printf("Conn %d write error: %v", connID, err)
return
}
}
}
}
func (s *TCPServer) isClosing() bool {
s.mu.Lock()
defer s.mu.Unlock()
return s.closing
}
// Shutdown initiates graceful shutdown: stop accepting new connections,
// close listener, and wait for active connections.
func (s *TCPServer) Shutdown(ctx context.Context) error {
s.mu.Lock()
s.closing = true
s.mu.Unlock()
if s.listener != nil {
if err := s.listener.Close(); err != nil {
return err
}
}
done := make(chan struct{})
go func() {
s.wg.Wait()
close(done)
}()
// Close the worker pool only after connections have drained (or we give up),
// so in-flight requests can still be processed by the workers.
defer func() {
if s.wp != nil {
s.wp.Close()
}
}()
select {
case <-ctx.Done():
return ctx.Err()
case <-done:
log.Printf("Graceful shutdown complete")
return nil
}
}
Note: for simplicity, handleConn sends directly on s.wp.tasks. You can wrap this in a method on the pool if you prefer, as sketched below.
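For example, a context-aware wrapper (a sketch; SubmitCtx is not part of the code above) would keep channel access inside the pool:

// SubmitCtx enqueues a task, giving up if ctx is cancelled first.
func (wp *WorkerPool) SubmitCtx(ctx context.Context, task Task) error {
	select {
	case wp.tasks <- task:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

handleConn could then replace its submit select with: if err := s.wp.SubmitCtx(ctx, task); err != nil { return }.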
7. Wiring It Up in main.go
A simple entry point that supports graceful shutdown with signals:
package main
import (
"context"
"log"
"os"
"os/signal"
"syscall"
"time"
"example.com/socket-server/internal/server"
)
func main() {
cfg := server.DefaultConfig()
srv := server.NewTCPServer(cfg)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Signal handling for graceful shutdown
sigC := make(chan os.Signal, 1)
signal.Notify(sigC, syscall.SIGINT, syscall.SIGTERM)
shutdownDone := make(chan struct{})
go func() {
sig := <-sigC
log.Printf("Received signal: %v, shutting down...", sig)
shutdownCtx, cancelShutdown := context.WithTimeout(context.Background(), 10*time.Second)
defer cancelShutdown()
if err := srv.Shutdown(shutdownCtx); err != nil {
log.Printf("Shutdown error: %v", err)
}
cancel() // stop the serve context
close(shutdownDone)
}()
if err := srv.Serve(ctx); err != nil {
log.Printf("Server error: %v", err)
return
}
// Serve returns nil once Shutdown closes the listener; wait for the
// shutdown goroutine so in-flight connections can finish draining.
<-shutdownDone
}
Compile and run:
go mod init example.com/socket-server
go mod tidy
go run ./cmd/server
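To exercise the server end to end, here is a minimal test client (a sketch; it reuses the protocol package and assumes the default :9000 address):

// cmd/client/main.go (illustrative)
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"

	"example.com/socket-server/internal/protocol"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Send one framed message and print the framed reply.
	if err := protocol.WriteFrame(conn, []byte("hello")); err != nil {
		log.Fatal(err)
	}
	reply, err := protocol.ReadFrame(bufio.NewReader(conn), 1<<20)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("reply: %s\n", reply) // e.g. "worker-7:hello"
}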
8. Performance Considerations
8.1 Goroutine Model
We use:
- One goroutine per connection (read + write handling).
- Fixed-size worker pool for heavier processing.
This gives:
- Simple code.
- Good performance for thousands of connections.
- Predictable CPU usage via pool size.
For ultra-high scale (hundreds of thousands of connections), you might explore:
- Sharded event loops.
- More advanced scheduling patterns.
- Connection batching.
But for most backends, the “goroutine per connection + worker pool” pattern is more than enough.
8.2 Backpressure and Overload
Our design introduces backpressure via:
- A bounded worker task channel (Backlog).
- The MaxConnections limit.
When overloaded:
- Newly accepted connections can be rejected.
- Sends on tasks block, slowing down producers (the connection goroutines).
You can also add:
- Load shedding: detect queue saturation and send “server busy” responses (see the sketch below).
- Prioritization: high priority tasks in separate channels.
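For load shedding, a non-blocking submit helper is enough (a sketch; TrySubmit is hypothetical, not part of the pool above):

// TrySubmit enqueues a task without blocking and reports whether it was accepted.
func (wp *WorkerPool) TrySubmit(task Task) bool {
	select {
	case wp.tasks <- task:
		return true
	default: // queue full
		return false
	}
}

In handleConn, a saturated queue can then be answered immediately instead of blocking:

if !s.wp.TrySubmit(task) {
	_ = protocol.WriteFrame(conn, []byte("server busy"))
	continue
}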
8.3 Deadlines and Timeouts
We used:
- ReadTimeout and WriteTimeout: limit individual read and write operations.
- IdleTimeout: an overall deadline set via SetDeadline when the connection is accepted, so stale connections are eventually closed.
Tuning depends on your use case:
- Low-latency APIs: use smaller timeouts.
- Streaming: longer timeouts and heartbeat pings (see the sketch below).
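For example, given a client’s net.Conn conn, the client can keep a long-lived connection alive by sending a small heartbeat frame periodically so the server’s read deadline keeps being refreshed (a sketch; the "ping" payload is a convention invented here, and a real client would also need to read the echoed replies and serialize writes with any other senders):

// Client-side heartbeat goroutine.
go func() {
	ticker := time.NewTicker(15 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if err := protocol.WriteFrame(conn, []byte("ping")); err != nil {
			return // connection is gone; stop the heartbeat
		}
	}
}()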
8.4 Keep-Alive and Socket Options
For long-lived connections, you can tune TCP options:
if tcpConn, ok := conn.(*net.TCPConn); ok {
tcpConn.SetKeepAlive(true)
tcpConn.SetKeepAlivePeriod(30 * time.Second)
}
You can also use net.ListenConfig to set advanced socket options like SO_REUSEADDR or SO_REUSEPORT for multi-process scenarios.
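For example, on Linux you could set SO_REUSEPORT through net.ListenConfig (a sketch; it requires the golang.org/x/sys/unix package and is platform-specific):

lc := net.ListenConfig{
	Control: func(network, address string, c syscall.RawConn) error {
		var sockErr error
		if err := c.Control(func(fd uintptr) {
			sockErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
		}); err != nil {
			return err
		}
		return sockErr
	},
}
ln, err := lc.Listen(ctx, "tcp", cfg.ListenAddr) // use ln in Serve instead of net.Listen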
9. Observability: Metrics and Logging
To make this production-ready, add:
- Prometheus metrics: active_connections, requests_total, request_duration_seconds, worker_queue_length.
- Structured logging with request IDs.
- Tracing (e.g., OpenTelemetry) at protocol message boundaries.
Example (conceptual) metric update:
// After accepting a connection
atomic.AddInt64(&s.activeConn, 1)
// Also update a gauge:
// activeConnectionsGauge.Inc()
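A more concrete sketch using the github.com/prometheus/client_golang library (an added dependency, not part of the code above; the ":2112" metrics port is arbitrary):

var activeConnectionsGauge = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "active_connections",
	Help: "Number of currently open client connections.",
})

func init() {
	prometheus.MustRegister(activeConnectionsGauge)
}

// serveMetrics exposes /metrics on its own HTTP port.
func serveMetrics() {
	http.Handle("/metrics", promhttp.Handler())
	log.Println(http.ListenAndServe(":2112", nil))
}

// In Serve, after accepting a connection:  activeConnectionsGauge.Inc()
// In handleConn's deferred cleanup:        activeConnectionsGauge.Dec()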
10. Where to Go Next
You now have a high-performance, multi-goroutine TCP server architecture in Go, with:
- A clear connection model
- Worker pool processing
- Backpressure and limits
- Timeouts and graceful shutdown
- A clean, extensible protocol framing layer
You can evolve this into:
- A custom game server (rooms, sessions, match logic).
- A binary RPC protocol.
- A gateway/proxy for HTTP, WebSocket, or gRPC.
- A streaming server for events and logs.