WebRTC Build Machine & Data Channels

The build machine is the component that actually runs your build. It speaks WebRTC directly to your browser or MCP client, enforces permission-based access on every channel, and runs the build itself inside a sandbox.

Anatomy of a build machine node

A build machine is a Linux host with:

The aegis-agent systemd service
A toolchain store containing installed IDF versions and toolchains
Sandbox software installed
Outbound HTTPS access to the build server (no inbound ports needed)
A WebRTC stack capable of acting as a peer

Importantly, the build machine never opens an inbound TCP port for the build flow. It establishes outbound connections to the build server to check for jobs, and outbound WebRTC peer connections to clients via the ICE servers the build server provides.

Checking for jobs

The build machine runs a loop:

every 2 seconds:
  GET ${CONTROL_BASE_URL}/agents/jobs?agent_id=...
  for each new job:
    verify permission signature locally
    if permission.peer_fingerprint matches the offer we're about to receive:
      open WebRTC peer connection with the configured ICE servers
      negotiate channels per permission.allowed_channels
      run build, stream results

The check loop is silent at INFO level – only errors are logged. This is intentional; a chatty log is hard to read. To see the check activity, operators set RUST_LOG=debug.
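
The check loop described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the logger name, the job field names ("id", "permission"), and the injected fetch_jobs, verify_permission, and start_job callables are all hypothetical stand-ins for the HTTP request, the local signature check, and the WebRTC/build step.

```python
import logging
import time

log = logging.getLogger("aegis_agent_sketch")  # hypothetical logger name


def poll_once(fetch_jobs, verify_permission, start_job, seen_ids):
    """Run one check; return how many new jobs were started."""
    try:
        jobs = fetch_jobs()  # stands in for GET .../agents/jobs?agent_id=...
    except Exception as exc:
        log.error("job check failed: %s", exc)  # errors are always logged
        return 0
    # Routine check activity is only visible at debug level.
    log.debug("job check returned %d job(s)", len(jobs))
    started = 0
    for job in jobs:
        if job["id"] in seen_ids:
            continue  # not a new job
        seen_ids.add(job["id"])
        if not verify_permission(job["permission"]):
            log.error("job %s: permission signature rejected", job["id"])
            continue
        start_job(job)  # open the peer connection, negotiate channels, build
        started += 1
    return started


def run_check_loop(fetch_jobs, verify_permission, start_job,
                   interval_s=2.0, max_checks=None, sleep=time.sleep):
    """Repeat poll_once every interval_s seconds (forever if max_checks is None)."""
    seen_ids = set()
    checks = 0
    while max_checks is None or checks < max_checks:
        poll_once(fetch_jobs, verify_permission, start_job, seen_ids)
        checks += 1
        sleep(interval_s)
```

Passing the network call and the sleep function in as parameters keeps the loop easy to exercise without a real build server.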

The three data channels (server-side enforcement)

When a peer connection negotiates channels, the build machine enforces:

Channel name whitelist – only channels listed in permission.allowed_channels are accepted. The build machine rejects and closes any channel not on the whitelist as soon as its ondatachannel event fires.
Per-channel handlers – espctl, pty, and firmware each have a dedicated handler that knows the message format and produces structured events. Unknown channel names get rejected even if they were granted (they have no handler).
Bandwidth limiter – a sliding-window byte counter per channel, configurable per permission. Bursts above the budget are throttled: writes slow down, but the peer is not disconnected.
Message rate limiter – same shape, but counting messages instead of bytes. Useful against pathological tight loops that ship lots of small messages.
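
To make the enforcement rules above concrete, here is a small sketch of the whitelist check and a sliding-window limiter. It is an illustration under assumptions, not the build machine's implementation: the function and class names are hypothetical, and the same limiter class is reused for bytes (cost = message size) and for messages (cost = 1).

```python
from collections import deque

# Channels that have a dedicated handler, per the list above.
HANDLERS = {"espctl", "pty", "firmware"}


def accept_channel(name, allowed_channels):
    """A channel must be both granted by the permission and have a handler."""
    return name in allowed_channels and name in HANDLERS


class SlidingWindowLimiter:
    """Counts cost (bytes or messages) sent during the last window_s seconds."""

    def __init__(self, budget, window_s):
        self.budget = budget
        self.window_s = window_s
        self._events = deque()  # (timestamp, cost), oldest first
        self._used = 0

    def _expire(self, now):
        # Drop entries that have slid out of the window.
        while self._events and self._events[0][0] + self.window_s <= now:
            self._used -= self._events.popleft()[1]

    def delay_for(self, cost, now):
        """Seconds to wait before a write of `cost` fits; 0.0 if it fits now."""
        if cost > self.budget:
            raise ValueError("a single write larger than the budget never fits")
        self._expire(now)
        excess = self._used + cost - self.budget
        if excess <= 0:
            return 0.0
        freed = 0
        for sent_at, sent_cost in self._events:
            freed += sent_cost
            if freed >= excess:
                # Enough budget frees up once this entry leaves the window.
                return sent_at + self.window_s - now
        return 0.0  # unreachable: excess never exceeds the recorded usage

    def record(self, cost, now):
        """Account for a write that was actually sent at time `now`."""
        self._events.append((now, cost))
        self._used += cost
```

A writer would call delay_for before each send, sleep for the returned time, and then call record. Because the limiter only ever returns a delay, a burst slows the channel down instead of closing it.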

How the build runs

Once the espctl channel is open and the build machine has received a BuildRequest message:

The build machine creates a workspace under /var/lib/aegis/workspace/{job_id}/.
If the request includes a project_bundle (a base64-encoded git bundle, <= 50 MB), the build machine writes it to a temp file and runs git clone <bundle-file> {workspace}/src outside the sandbox.
The build machine stages a clean sandbox configuration that:
- Mounts {workspace}/src read-write
- Mounts the relevant IDF version from the store read-only
- Mounts a small writable /tmp for build scratch space
- Drops all capabilities, denies network access, denies new mounts
Inside the sandbox, the build machine runs idf.py build (or whatever the recipe specifies).
As compilation proceeds, the build machine reads the child process’s stdout and stderr, multiplexes the lines into the pty channel as raw bytes, and sends structured PipelineEvent messages on the espctl channel (e.g. “phase: compiling, progress 0.42”).
When the build finishes, the build machine reads the resulting .bin file from the workspace, computes a SHA-256 over the contents, and ships the bytes back as chunks on the firmware channel (followed by a final manifest message containing the SHA-256 and total size).
After a configurable delay or when the peer disconnects, the build machine cleans up the workspace.
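
As an illustration of the project_bundle step in the sequence above, the sketch below decodes the bundle, enforces the size cap, writes it to a temp file, and builds the git clone command without running it. The function name is hypothetical, and two points are assumptions the text does not settle: that the 50 MB cap applies to the decoded bytes, and that MB means 1024 * 1024 bytes.

```python
import base64
import binascii
import tempfile
from pathlib import Path

# Assumption: the cap applies to the decoded bundle, with MB = 1024 * 1024 bytes.
MAX_BUNDLE_BYTES = 50 * 1024 * 1024


def stage_bundle(project_bundle_b64, workspace):
    """Write the decoded bundle to a temp file and return (path, clone command)."""
    try:
        raw = base64.b64decode(project_bundle_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError("project_bundle is not valid base64") from exc
    if len(raw) > MAX_BUNDLE_BYTES:
        raise ValueError("project_bundle exceeds the size cap")

    workspace = Path(workspace)
    workspace.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(
        suffix=".bundle", dir=workspace, delete=False
    ) as handle:
        handle.write(raw)
        bundle_path = Path(handle.name)

    # The command is returned rather than executed so the caller decides
    # where and how it runs.
    clone_cmd = ["git", "clone", str(bundle_path), str(workspace / "src")]
    return bundle_path, clone_cmd
```

Returning the command as an argument list, rather than a shell string, avoids quoting problems if a workspace path ever contains unusual characters.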

Wire format

Messages on the espctl channel are JSON by default for browser clients and bincode-encoded for native Rust clients. The build machine auto-detects the encoding from the first byte. The schema lives in the aegis-proto crate; it’s stable across minor versions.

The pty channel is raw bytes – no framing, no escape codes added by the build machine. Whatever the child process writes to its TTY ends up in the channel.

The firmware channel uses a tiny chunked framing: a header message declaring num_chunks + total_size + sha256, followed by N raw binary chunks.
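
The firmware framing can be sketched as below. Only the three header fields (num_chunks, total_size, sha256) come from the description above; the JSON header encoding, the 16 KiB chunk size, and the function names are assumptions made for illustration.

```python
import hashlib
import json

CHUNK_SIZE = 16 * 1024  # hypothetical chunk size; the real value is not stated


def frame_firmware(image, chunk_size=CHUNK_SIZE):
    """Return the header message followed by the raw binary chunks."""
    chunks = [image[i:i + chunk_size] for i in range(0, len(image), chunk_size)]
    header = json.dumps({
        "num_chunks": len(chunks),
        "total_size": len(image),
        "sha256": hashlib.sha256(image).hexdigest(),
    }).encode("utf-8")
    return [header, *chunks]


def reassemble_firmware(messages):
    """Rebuild the image on the receiving side and verify it against the header."""
    header = json.loads(messages[0])
    chunks = messages[1:]
    if len(chunks) != header["num_chunks"]:
        raise ValueError("unexpected number of chunks")
    image = b"".join(chunks)
    if len(image) != header["total_size"]:
        raise ValueError("size mismatch")
    if hashlib.sha256(image).hexdigest() != header["sha256"]:
        raise ValueError("SHA-256 mismatch")
    return image
```

Checking the chunk count, the total size, and the digest separately gives the receiver a more specific error than a digest mismatch alone.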

Data queue cap and throughput

There’s a subtle point worth knowing if you’re tuning performance:

WebRTC data channels have a configurable per-channel send queue. Production builds cap that queue at 128 KB (test builds use 128 MB to avoid blocking unit tests, which can mislead casual benchmarks).

Over a 500 ms round-trip connection through a fallback relay server, and assuming at most one full queue is in flight per round trip, this works out to roughly:

128 KB / 500 ms = 256 KB/s effective throughput

…which is fine for log streaming and small firmware images, but you’ll notice it on large *.bin files (~1 MB and up). Direct peer-to-peer connections without a relay are dramatically faster.
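
The arithmetic above is easy to replay for other queue sizes and round-trip times. The helper names below are hypothetical, KB is taken as 1024 bytes, and the model is the same simple one used above (one full queue delivered per round trip), so the results are rough estimates rather than measurements.

```python
KIB = 1024


def effective_throughput(queue_bytes, rtt_s):
    """Bytes per second if one full send queue is delivered per round trip."""
    return queue_bytes / rtt_s


def transfer_seconds(payload_bytes, queue_bytes, rtt_s):
    """Estimated time to ship a payload at that effective throughput."""
    return payload_bytes / effective_throughput(queue_bytes, rtt_s)


rate = effective_throughput(128 * KIB, 0.5)
print(rate / KIB)                                     # 256.0 (KB/s)
print(transfer_seconds(1024 * KIB, 128 * KIB, 0.5))   # 4.0 (seconds for 1 MB)
```

Under this model a 1 MB firmware image takes about 4 seconds over the relayed connection, which is the point at which the cap becomes noticeable.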

Failure modes

Connection never converges – the data channels never open. The build machine’s on_open handler never fires, and the peer connection state transitions to Failed after ~5 seconds. Always implement a fast-fail on the client side that watches for the Failed, Disconnected, and Closed states in parallel with waiting for on_open.
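
The fast-fail pattern can be sketched with asyncio as a race between "channel opened" and "connection entered a terminal state". This is a generic illustration, not tied to any particular WebRTC library: the asyncio.Event and asyncio.Queue are hypothetical stand-ins for whatever on_open and connection-state callbacks the client's stack provides, and the lowercase state names are an assumption.

```python
import asyncio

TERMINAL_STATES = {"failed", "disconnected", "closed"}


class ConnectionFailed(Exception):
    """Raised when the channel cannot open."""


async def wait_for_open(opened, state_changes, timeout_s=10.0):
    """Return once `opened` is set; raise if a terminal state arrives first."""

    async def watch_state():
        while True:
            state = await state_changes.get()
            if state in TERMINAL_STATES:
                return state

    open_task = asyncio.create_task(opened.wait())
    state_task = asyncio.create_task(watch_state())
    try:
        done, _pending = await asyncio.wait(
            {open_task, state_task},
            timeout=timeout_s,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if open_task in done:
            return
        if state_task in done:
            raise ConnectionFailed(
                f"connection entered {state_task.result()} before the channel opened"
            )
        raise ConnectionFailed("timed out waiting for the channel to open")
    finally:
        # Cancel whichever waiter is still pending.
        open_task.cancel()
        state_task.cancel()
```

The client's on_open callback would set the event, and its connection-state callback would push each new state onto the queue. Without the second waiter, a client that only awaits on_open would hang until its own timeout instead of failing as soon as the connection does.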

Sandbox failure – the sandbox refuses to start (missing capability, host-side configuration issue). The build fails immediately with a structured error on the espctl channel; the data channels stay open so the client can read it.

Build process exceeds memory – the sandbox’s memory limit kills the child process. The build machine reports this as a build failure with the OOM signal in the structured error. The data channels stay open.

Permission expires mid-build – the build machine accepts no new work under an expired permission, but in-flight builds run to completion. The build server does not attempt to revoke the permission retroactively. If you need a build to be interruptible, use build.cancel.

ESP-IDF MCP — User Manual