The Network
The WebSocket relay
Lesson 6 of 10
What you'll learn
- Understand why mDNS can't reach a node on another network
- See how a relay uses persistent WebSocket connections to bridge them
- Model the hub that registers, routes, and broadcasts messages
mDNS stops at the edge of your LAN: it uses link-local multicast, and routers don't forward those packets. So your laptop on café Wi-Fi can't discover the GPU box at home. The fix is a relay: a small, publicly reachable server that every node dials out to. Because the nodes open the connections themselves, nothing needs an inbound firewall hole or port forwarding. The relay just holds the sockets open and shuttles messages between them.
Why WebSockets and not plain HTTP requests? HTTP is one-shot and client-initiated: the server can only answer a request the client has already made, so it can't call you back. The relay can't open a connection to a node behind NAT either, yet it needs to push a request to that node at any moment. That calls for a persistent, bidirectional channel, which is exactly what a WebSocket is: one long-lived connection that either side can write to.
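To make the direction of travel concrete, here is a minimal sketch that uses plain TCP on loopback rather than a real WebSocket (a WebSocket adds an HTTP upgrade handshake and message framing on top). The function name pushOverDialedConn and the message text are made up for illustration. The only point it demonstrates is that once the node has dialed out, the relay side can write first on that same connection.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// pushOverDialedConn starts a stand-in "relay" listener on loopback, has a
// "node" dial out to it, and lets the relay push one line to the node over the
// connection the node opened. It returns what the node received.
func pushOverDialedConn(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// Relay side: accept one connection, then write to it unprompted.
	errc := make(chan error, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			errc <- err
			return
		}
		defer conn.Close()
		_, err = fmt.Fprintln(conn, msg)
		errc <- err
	}()

	// Node side: dial out, then wait for whatever the relay sends.
	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()

	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", err
	}
	if err := <-errc; err != nil {
		return "", err
	}
	return line[:len(line)-1], nil // drop the trailing newline
}

func main() {
	got, err := pushOverDialedConn("run inference")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("node received:", got)
}
```

The node never listens on a port here; it only dials. That is why the same arrangement works from behind a home router or café Wi-Fi.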
A hub of connections
The relay keeps an in-memory map of node ID to live socket. Three operations cover everything: register on connect, route a message to one node, broadcast to all.
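// apps/relay/internal/ws (simplified hub)
type Hub struct {
	mu    sync.RWMutex
	conns map[string]*websocket.Conn // nodeID -> socket; must be initialized with make before use
}

// Register stores the live socket for a node, replacing any earlier entry.
func (h *Hub) Register(id string, c *websocket.Conn) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.conns[id] = c
}

// Route forwards msg to a single node, or fails if that node is not connected.
func (h *Hub) Route(target string, msg []byte) error {
	h.mu.RLock()
	c := h.conns[target]
	h.mu.RUnlock()
	if c == nil {
		return fmt.Errorf("node %s not connected", target)
	}
	// The write happens outside the lock. If the socket type allows only one
	// writer at a time, concurrent Route calls to the same node need serializing.
	return c.WriteMessage(websocket.TextMessage, msg)
}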
When a request arrives at the relay's HTTP API for a model that lives on a remote node, the relay routes it over that node's socket, the node runs inference, and the answer streams back over the same connection. The relay never sees your models or runs inference — it's a switchboard.
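Because one socket can carry several answers at once, the relay needs some way to match each streamed piece to the request that asked for it. The lesson doesn't show how the real relay does this, so the following is only a sketch under assumptions: the chunk envelope, the pending type, and the "req-1" identifier are all hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk is a hypothetical envelope for one piece of a streamed answer.
type chunk struct {
	RequestID string
	Data      string
	Done      bool // true on the last chunk of a response
}

// pending tracks requests that are still waiting on a node's reply.
type pending struct {
	mu      sync.Mutex
	waiters map[string]chan string
}

func newPending() *pending {
	return &pending{waiters: make(map[string]chan string)}
}

// expect registers a request ID and returns the channel its chunks arrive on.
func (p *pending) expect(id string) <-chan string {
	ch := make(chan string, 16)
	p.mu.Lock()
	p.waiters[id] = ch
	p.mu.Unlock()
	return ch
}

// deliver hands one chunk read from a node's socket to whoever is waiting.
// It reports false when no request with that ID is pending.
func (p *pending) deliver(c chunk) bool {
	p.mu.Lock()
	ch, ok := p.waiters[c.RequestID]
	if ok && c.Done {
		delete(p.waiters, c.RequestID)
	}
	p.mu.Unlock()
	if !ok {
		return false
	}
	if c.Data != "" {
		ch <- c.Data
	}
	if c.Done {
		close(ch)
	}
	return true
}

// collect drains a response channel into a single string.
func collect(ch <-chan string) string {
	var out string
	for s := range ch {
		out += s
	}
	return out
}

func main() {
	p := newPending()
	answer := p.expect("req-1")

	// Stand-in for the relay's read loop on the node's socket.
	go func() {
		for _, c := range []chunk{
			{RequestID: "req-1", Data: "Hello"},
			{RequestID: "req-1", Data: ", world"},
			{RequestID: "req-1", Done: true},
		} {
			p.deliver(c)
		}
	}()

	fmt.Println(collect(answer))
}
```

The sketch assumes a single read loop per socket, so chunks for one request arrive in order and the channel is closed exactly once.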
The relay is stateless about content
It knows which nodes are connected and forwards bytes; it doesn't store prompts, models, or outputs. Presence is mirrored to the control plane (Convex) so the dashboard can show who's online, but the relay itself is just live sockets in memory.
The challenge models the hub: register a couple of connections, route to one, and broadcast to all.
Run it. Each 'connection' just logs what it receives. Route hits one node; broadcast hits every node.
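One way to model the hub without any network is sketched below. It is not the relay's actual code: the messageWriter interface, the logConn fake, NewHub, and the node names are assumptions introduced so the example runs with only the standard library. Broadcast, which the excerpt above does not show, is written here as one plausible implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// messageWriter is a hypothetical stand-in for the one socket method the hub needs.
type messageWriter interface {
	WriteMessage(messageType int, data []byte) error
}

// textMessage mirrors the WebSocket text-frame opcode (0x1 in RFC 6455).
const textMessage = 1

// logConn is a fake connection that records and prints what it receives.
type logConn struct {
	name     string
	received []string
}

func (c *logConn) WriteMessage(_ int, data []byte) error {
	c.received = append(c.received, string(data))
	fmt.Printf("[%s] got: %s\n", c.name, data)
	return nil
}

// Hub maps node IDs to live connections.
type Hub struct {
	mu    sync.RWMutex
	conns map[string]messageWriter
}

func NewHub() *Hub {
	return &Hub{conns: make(map[string]messageWriter)}
}

// Register stores a connection under its node ID.
func (h *Hub) Register(id string, c messageWriter) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.conns[id] = c
}

// Route sends msg to exactly one node, or fails if it is not connected.
func (h *Hub) Route(target string, msg []byte) error {
	h.mu.RLock()
	c, ok := h.conns[target]
	h.mu.RUnlock()
	if !ok {
		return fmt.Errorf("node %s not connected", target)
	}
	return c.WriteMessage(textMessage, msg)
}

// Broadcast sends msg to every registered node and returns how many writes succeeded.
func (h *Hub) Broadcast(msg []byte) int {
	// Snapshot the connections so the lock is not held during writes.
	h.mu.RLock()
	targets := make([]messageWriter, 0, len(h.conns))
	for _, c := range h.conns {
		targets = append(targets, c)
	}
	h.mu.RUnlock()

	sent := 0
	for _, c := range targets {
		if err := c.WriteMessage(textMessage, msg); err == nil {
			sent++
		}
	}
	return sent
}

func main() {
	hub := NewHub()
	hub.Register("laptop", &logConn{name: "laptop"})
	hub.Register("gpu-box", &logConn{name: "gpu-box"})

	_ = hub.Route("gpu-box", []byte("run inference")) // only gpu-box logs this
	hub.Broadcast([]byte("relay restarting"))          // both nodes log this

	if err := hub.Route("phone", []byte("hi")); err != nil {
		fmt.Println("route error:", err)
	}
}
```

Broadcast order is not fixed, because Go map iteration order is unspecified. Routing to an unregistered ID returns an error instead of silently dropping the message.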
Why does the relay use a WebSocket instead of ordinary HTTP requests to each node?
Next: the request format itself — serving an OpenAI-compatible API with streamed tokens.