Designing a High-Performance, Multi-Goroutine Socket Server in Go
Leeting Yan
High-performance networking is one of Go’s strengths. With lightweight goroutines, a rich net package, and strong concurrency primitives, Go is a great fit for building custom TCP servers, game backends, proxies, and internal protocols.
In this article, we’ll design and implement a high-performance, multi-goroutine socket server in Go, with an architecture you can evolve into a real-world production system.
We’ll cover:
- Architecture and design goals
- A baseline TCP server
- A multi-goroutine concurrency model
- Connection limits and backpressure
- Request handling with worker pools
- Graceful shutdown and observability
1. Goals and Design Principles
We’ll design a server that:
- Accepts many concurrent TCP connections.
- Handles each connection in dedicated goroutines.
- Uses a worker pool for heavier request processing.
- Implements backpressure and connection limits.
- Supports graceful shutdown (finish in-flight work).
- Is easy to extend with custom protocols.
We’ll assume a simple binary protocol:
- Each message is length-prefixed with a 4-byte big-endian uint32.
- After the length comes that many bytes of payload.
- The server echoes the payload back with the same framing.
- In a real system, the payload would be JSON, protobuf, etc.
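For example, a frame carrying the 4-byte payload "ping" is 8 bytes on the wire:
00 00 00 04 70 69 6E 67   // length = 4 (big-endian), then "ping" in ASCII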
2. Project Layout
A simple layout:
socket-server/
go.mod
cmd/server/main.go
internal/server/config.go
internal/server/tcp_server.go
internal/server/worker_pool.go
internal/protocol/frame.go
For brevity, we’ll inline most code here, but you can easily split it into the files and packages above in your own project.
3. Configuration and Constants
Start with some configuration defaults.
package server
import "time"
type Config struct {
ListenAddr string // e.g. ":9000"
MaxConnections int // maximum concurrent clients
ReadTimeout time.Duration // per read
WriteTimeout time.Duration // per write
IdleTimeout time.Duration // overall idle
WorkerPoolSize int // number of workers
MaxMessageSize int // limit message size
Backlog int // optional: backlog for channels
}
func DefaultConfig() Config {
return Config{
ListenAddr: ":9000",
MaxConnections: 10000,
ReadTimeout: 10 * time.Second,
WriteTimeout: 10 * time.Second,
IdleTimeout: 60 * time.Second,
WorkerPoolSize: 64,
MaxMessageSize: 1 << 20, // 1MB
Backlog: 1024,
}
}
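For example, a caller can start from the defaults and override only what it needs (the values below are purely illustrative; the field names come from the Config above):

cfg := server.DefaultConfig()
cfg.ListenAddr = ":7000"       // custom port for this deployment
cfg.WorkerPoolSize = 128       // more workers for CPU-heavy handlers
cfg.MaxMessageSize = 256 << 10 // cap frames at 256 KB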
4. Protocol Framing: Length-Prefixed Messages
Let’s build simple framing helpers so reads and writes on a net.Conn stay clean.
package protocol
import (
"bufio"
"encoding/binary"
"fmt"
"io"
)
const headerSize = 4 // 4 bytes length prefix
// ReadFrame reads a single length-prefixed message from r.
func ReadFrame(r *bufio.Reader, maxSize int) ([]byte, error) {
header := make([]byte, headerSize)
if _, err := io.ReadFull(r, header); err != nil {
return nil, err
}
length := binary.BigEndian.Uint32(header)
if length == 0 {
return nil, fmt.Errorf("empty frame")
}
if int(length) > maxSize {
return nil, fmt.Errorf("frame too large: %d > %d", length, maxSize)
}
payload := make([]byte, length)
if _, err := io.ReadFull(r, payload); err != nil {
return nil, err
}
return payload, nil
}
// WriteFrame writes a length-prefixed message to w.
func WriteFrame(w io.Writer, payload []byte) error {
header := make([]byte, headerSize)
binary.BigEndian.PutUint32(header, uint32(len(payload)))
if _, err := w.Write(header); err != nil {
return err
}
_, err := w.Write(payload)
return err
}
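As a quick sanity check, the two helpers round-trip through an in-memory buffer. A minimal test sketch (the file name internal/protocol/frame_test.go is just a suggestion):

package protocol

import (
	"bufio"
	"bytes"
	"testing"
)

func TestFrameRoundTrip(t *testing.T) {
	var buf bytes.Buffer
	if err := WriteFrame(&buf, []byte("hello")); err != nil {
		t.Fatal(err)
	}
	got, err := ReadFrame(bufio.NewReader(&buf), 1<<20)
	if err != nil {
		t.Fatal(err)
	}
	if string(got) != "hello" {
		t.Fatalf("round trip mismatch: got %q, want %q", got, "hello")
	}
}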
5. Worker Pool for Request Processing
We don’t want each connection goroutine to do heavy CPU or I/O work itself. Instead, we’ll dispatch tasks to a shared worker pool.
package server
import (
"context"
"strconv"
)
type Task struct {
ConnID uint64
Payload []byte
ResultC chan []byte
ErrC chan error
}
type WorkerPool struct {
cfg Config
tasks chan Task
cancel context.CancelFunc
}
func NewWorkerPool(ctx context.Context, cfg Config) *WorkerPool {
ctx, cancel := context.WithCancel(ctx)
wp := &WorkerPool{
cfg: cfg,
tasks: make(chan Task, cfg.Backlog),
cancel: cancel,
}
for i := 0; i < cfg.WorkerPoolSize; i++ {
go wp.workerLoop(ctx, i)
}
return wp
}
// Submit blocks until a worker can accept the task, which provides natural backpressure.
func (wp *WorkerPool) Submit(task Task) {
wp.tasks <- task
}
// Close signals all workers to exit; tasks still queued may never be processed.
func (wp *WorkerPool) Close() {
wp.cancel()
}
func (wp *WorkerPool) workerLoop(ctx context.Context, workerID int) {
for {
select {
case <-ctx.Done():
return
case task := <-wp.tasks:
// In a real server: parse payload, route, call business logic, DB, etc.
// Here we "echo" with a small modification to show processing.
processed := append([]byte{}, task.Payload...)
// Example: prefix with the worker ID (strconv formats IDs >= 10 correctly)
result := append([]byte("worker-"+strconv.Itoa(workerID)+":"), processed...)
select {
case task.ResultC <- result:
case <-ctx.Done():
task.ErrC <- ctx.Err()
}
}
}
}
This is intentionally simple: each Task includes a response channel so the connection goroutine can resume once processing is done.
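A minimal usage sketch of the pool from inside the server package (the output shown is illustrative; fmt would need to be imported):

// demoWorkerPool is a tiny illustration, not production code.
func demoWorkerPool() {
	wp := NewWorkerPool(context.Background(), DefaultConfig())
	defer wp.Close()

	resultC := make(chan []byte, 1)
	errC := make(chan error, 1)
	wp.Submit(Task{ConnID: 1, Payload: []byte("hello"), ResultC: resultC, ErrC: errC})

	select {
	case res := <-resultC:
		fmt.Printf("processed: %s\n", res) // e.g. "worker-3:hello"
	case err := <-errC:
		fmt.Println("worker error:", err)
	}
}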
6. Connection Handling: One Goroutine per Connection
Now let’s implement the TCP server: accept connections, limit them, and handle each in a goroutine.
package server
import (
"bufio"
"context"
"fmt"
"log"
"net"
"sync"
"sync/atomic"
"time"
"example.com/socket-server/internal/protocol"
)
type TCPServer struct {
cfg Config
listener net.Listener
activeConn int64 // atomic
nextID uint64
wp *WorkerPool
mu sync.Mutex
closing bool
wg sync.WaitGroup
}
func NewTCPServer(cfg Config) *TCPServer {
return &TCPServer{
cfg: cfg,
}
}
func (s *TCPServer) Serve(ctx context.Context) error {
// Setup listener with optional advanced options via net.ListenConfig
ln, err := net.Listen("tcp", s.cfg.ListenAddr)
if err != nil {
return err
}
s.listener = ln
log.Printf("Listening on %s", s.cfg.ListenAddr)
// Start worker pool
s.wp = NewWorkerPool(ctx, s.cfg)
// Accept loop
for {
conn, err := ln.Accept()
if err != nil {
if s.isClosing() {
return nil
}
log.Printf("Accept error: %v", err)
continue
}
// Connection limit
if atomic.LoadInt64(&s.activeConn) >= int64(s.cfg.MaxConnections) {
log.Printf("Too many connections, rejecting new client")
conn.Close()
continue
}
atomic.AddInt64(&s.activeConn, 1)
s.wg.Add(1)
connID := atomic.AddUint64(&s.nextID, 1)
go s.handleConn(ctx, connID, conn)
}
}
func (s *TCPServer) handleConn(ctx context.Context, connID uint64, conn net.Conn) {
defer func() {
conn.Close()
atomic.AddInt64(&s.activeConn, -1)
s.wg.Done()
log.Printf("Conn %d closed", connID)
}()
log.Printf("Conn %d accepted from %s", connID, conn.RemoteAddr())
// Buffers
reader := bufio.NewReader(conn)
// Initial overall deadline; the per-read/per-write deadlines set below override it once traffic starts
if s.cfg.IdleTimeout > 0 {
_ = conn.SetDeadline(time.Now().Add(s.cfg.IdleTimeout))
}
for {
// Reset read deadline
if s.cfg.ReadTimeout > 0 {
_ = conn.SetReadDeadline(time.Now().Add(s.cfg.ReadTimeout))
}
payload, err := protocol.ReadFrame(reader, s.cfg.MaxMessageSize)
if err != nil {
if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
log.Printf("Conn %d read timeout", connID)
} else {
log.Printf("Conn %d read error: %v", connID, err)
}
return
}
// Prepare task
resultC := make(chan []byte, 1)
errC := make(chan error, 1)
task := Task{
ConnID: connID,
Payload: payload,
ResultC: resultC,
ErrC: errC,
}
// Submit to worker pool (this may block if backlog is full – backpressure)
select {
case <-ctx.Done():
return
case s.wp.tasks <- task: // send on the channel directly so we can select on ctx alongside it
}
// Wait for worker response or context cancellation
select {
case <-ctx.Done():
return
case err := <-errC:
if err != nil {
log.Printf("Conn %d worker error: %v", connID, err)
return
}
case result := <-resultC:
if s.cfg.WriteTimeout > 0 {
_ = conn.SetWriteDeadline(time.Now().Add(s.cfg.WriteTimeout))
}
if err := protocol.WriteFrame(conn, result); err != nil {
log.Printf("Conn %d write error: %v", connID, err)
return
}
}
}
}
func (s *TCPServer) isClosing() bool {
s.mu.Lock()
defer s.mu.Unlock()
return s.closing
}
// Shutdown initiates graceful shutdown: stop accepting new connections,
// close listener, and wait for active connections.
func (s *TCPServer) Shutdown(ctx context.Context) error {
s.mu.Lock()
s.closing = true
s.mu.Unlock()
if s.listener != nil {
if err := s.listener.Close(); err != nil {
return err
}
}
done := make(chan struct{})
go func() {
s.wg.Wait()
close(done)
}()
// Close the worker pool only after connections have drained (or we give up),
// so in-flight requests can still be processed by the workers.
defer func() {
if s.wp != nil {
s.wp.Close()
}
}()
select {
case <-ctx.Done():
return ctx.Err()
case <-done:
log.Printf("Graceful shutdown complete")
return nil
}
}
Note: for simplicity, handleConn sends directly on s.wp.tasks. You can wrap this in a method on the pool if you prefer, as sketched below.
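For example, a context-aware wrapper (a sketch; SubmitCtx is not part of the code above) would keep channel access inside the pool:

// SubmitCtx enqueues a task, giving up if ctx is cancelled first.
func (wp *WorkerPool) SubmitCtx(ctx context.Context, task Task) error {
	select {
	case wp.tasks <- task:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

handleConn could then replace its submit select with: if err := s.wp.SubmitCtx(ctx, task); err != nil { return }.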
7. Wiring It Up in main.go
A simple entry point that supports graceful shutdown with signals:
package main
import (
"context"
"log"
"os"
"os/signal"
"syscall"
"time"
"example.com/socket-server/internal/server"
)
func main() {
cfg := server.DefaultConfig()
srv := server.NewTCPServer(cfg)
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Signal handling for graceful shutdown
sigC := make(chan os.Signal, 1)
signal.Notify(sigC, syscall.SIGINT, syscall.SIGTERM)
shutdownDone := make(chan struct{})
go func() {
sig := <-sigC
log.Printf("Received signal: %v, shutting down...", sig)
shutdownCtx, cancelShutdown := context.WithTimeout(context.Background(), 10*time.Second)
defer cancelShutdown()
if err := srv.Shutdown(shutdownCtx); err != nil {
log.Printf("Shutdown error: %v", err)
}
cancel() // stop the serve context
close(shutdownDone)
}()
if err := srv.Serve(ctx); err != nil {
log.Printf("Server error: %v", err)
return
}
// Serve returns nil once Shutdown closes the listener; wait for the
// shutdown goroutine so in-flight connections can finish draining.
<-shutdownDone
}
Compile and run:
go mod init example.com/socket-server
go mod tidy
go run ./cmd/server
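To exercise the server end to end, here is a minimal test client (a sketch; it reuses the protocol package and assumes the default :9000 address):

// cmd/client/main.go (illustrative)
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"

	"example.com/socket-server/internal/protocol"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:9000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Send one framed message and print the framed reply.
	if err := protocol.WriteFrame(conn, []byte("hello")); err != nil {
		log.Fatal(err)
	}
	reply, err := protocol.ReadFrame(bufio.NewReader(conn), 1<<20)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("reply: %s\n", reply) // e.g. "worker-7:hello"
}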
8. Performance Considerations
8.1 Goroutine Model
We use:
- One goroutine per connection (read + write handling).
- Fixed-size worker pool for heavier processing.
This gives:
- Simple code.
- Good performance for thousands of connections.
- Predictable CPU usage via pool size.
For ultra-high scale (hundreds of thousands of connections), you might explore:
- Sharded event loops.
- More advanced scheduling patterns.
- Connection batching.
But for most backends, the “goroutine per connection + worker pool” pattern is more than enough.
8.2 Backpressure and Overload
Our design introduces backpressure via:
- A bounded worker task channel (Backlog).
- The MaxConnections limit.
When overloaded:
- Newly accepted connections can be rejected.
- Sends on tasks block, slowing down producers (the connection goroutines).
You can also add:
- Load shedding: detect queue saturation and send “server busy” responses (see the sketch below).
- Prioritization: high priority tasks in separate channels.
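For load shedding, a non-blocking submit helper is enough (a sketch; TrySubmit is hypothetical, not part of the pool above):

// TrySubmit enqueues a task without blocking and reports whether it was accepted.
func (wp *WorkerPool) TrySubmit(task Task) bool {
	select {
	case wp.tasks <- task:
		return true
	default: // queue full
		return false
	}
}

In handleConn, a saturated queue can then be answered immediately instead of blocking:

if !s.wp.TrySubmit(task) {
	_ = protocol.WriteFrame(conn, []byte("server busy"))
	continue
}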
8.3 Deadlines and Timeouts
We used:
- ReadTimeout and WriteTimeout: limit individual read and write operations.
- IdleTimeout: an overall deadline set via SetDeadline when the connection is accepted, so stale connections are eventually closed.
Tuning depends on your use case:
- Low-latency APIs: use smaller timeouts.
- Streaming: longer timeouts and heartbeat pings (see the sketch below).
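For example, given a client’s net.Conn conn, the client can keep a long-lived connection alive by sending a small heartbeat frame periodically so the server’s read deadline keeps being refreshed (a sketch; the "ping" payload is a convention invented here, and a real client would also need to read the echoed replies and serialize writes with any other senders):

// Client-side heartbeat goroutine.
go func() {
	ticker := time.NewTicker(15 * time.Second)
	defer ticker.Stop()
	for range ticker.C {
		if err := protocol.WriteFrame(conn, []byte("ping")); err != nil {
			return // connection is gone; stop the heartbeat
		}
	}
}()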
8.4 Keep-Alive and Socket Options
For long-lived connections, you can tune TCP options:
if tcpConn, ok := conn.(*net.TCPConn); ok {
tcpConn.SetKeepAlive(true)
tcpConn.SetKeepAlivePeriod(30 * time.Second)
}
You can also use net.ListenConfig to set advanced socket options like SO_REUSEADDR or SO_REUSEPORT for multi-process scenarios.
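For example, on Linux you could set SO_REUSEPORT through net.ListenConfig (a sketch; it requires the golang.org/x/sys/unix package and is platform-specific):

lc := net.ListenConfig{
	Control: func(network, address string, c syscall.RawConn) error {
		var sockErr error
		if err := c.Control(func(fd uintptr) {
			sockErr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
		}); err != nil {
			return err
		}
		return sockErr
	},
}
ln, err := lc.Listen(ctx, "tcp", cfg.ListenAddr) // use ln in Serve instead of net.Listen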
9. Observability: Metrics and Logging
To make this production-ready, add:
- Prometheus metrics: active_connections, requests_total, request_duration_seconds, worker_queue_length.
- Structured logging with request IDs.
- Tracing (e.g., OpenTelemetry) at protocol message boundaries.
Example (conceptual) metric update:
// After accepting a connection
atomic.AddInt64(&s.activeConn, 1)
// Also update a gauge:
// activeConnectionsGauge.Inc()
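A more concrete sketch using the github.com/prometheus/client_golang library (an added dependency, not part of the code above; the ":2112" metrics port is arbitrary):

var activeConnectionsGauge = prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "active_connections",
	Help: "Number of currently open client connections.",
})

func init() {
	prometheus.MustRegister(activeConnectionsGauge)
}

// serveMetrics exposes /metrics on its own HTTP port.
func serveMetrics() {
	http.Handle("/metrics", promhttp.Handler())
	log.Println(http.ListenAndServe(":2112", nil))
}

// In Serve, after accepting a connection:  activeConnectionsGauge.Inc()
// In handleConn's deferred cleanup:        activeConnectionsGauge.Dec()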
10. Where to Go Next
You now have a high-performance, multi-goroutine TCP server architecture in Go, with:
- A clear connection model
- Worker pool processing
- Backpressure and limits
- Timeouts and graceful shutdown
- A clean, extensible protocol framing layer
You can evolve this into:
- A custom game server (rooms, sessions, match logic).
- A binary RPC protocol.
- A gateway/proxy for HTTP, WebSocket, or gRPC.
- A streaming server for events and logs.