09: WASM Sandbox Managed Hosting (untrusted server modules)
How Lattice runs untrusted, customer-authored server modules safely under a managed-hosting model, and how the same game-sim source also ships as a trusted native binary for self-hosting / P2P — write once, target both.
This document is grounded in a working implementation under host/lattice-wasm-host/ (the runner, the dual-compile sample, the admission scan, the test suite) and control-plane/lattice-console/ (the upload → scan → register flow). Everything claimed "verified" below has pasted output in §9.
Read alongside:
02 §18 Shared simulation library ·
02 §19 Public C ABI ·
07 Custom Module Guide ·
C16 mediated web fetch (lattice_http_*).
1. The problem
The custom-module model (07) lets a developer author
one game-sim in C++ and link it against liblattice to produce a dedicated
server or a P2P host. That is perfect when the developer runs their own code on
their own fleet — they already trust it.
Managed hosting is different. Here the platform runs the customer's server
module on Lattice infrastructure, next to other tenants. The module is now
untrusted code on our machines. A native .so linked into our process can do
anything the process can: open sockets, read the filesystem, read another
tenant's memory, spawn processes, exfiltrate data. We cannot run untrusted native
code in-process. We need an isolation boundary.
2. The boundary is ISOLATION, not scanning
The single most important design decision:
Safety comes from the sandbox, not from inspecting the binary for "bad" code.
Malware scanning / static "does this look malicious" analysis is a losing arms race: an attacker iterates against your detector until it passes. We do not rely on it. Instead we run the untrusted module inside a WebAssembly sandbox, which is a capability boundary: a wasm module has its own linear memory, has no ambient authority (no syscalls, no file handles, no sockets, no pointers into host memory), and can affect the outside world only through the imports the host explicitly grants it. If we grant nothing but a fixed set of Lattice host functions, then by construction the module can do nothing else — regardless of what code it contains.
The admission scan we do run (§6)
is therefore not a malware detector. It is a static proof that the sandbox
boundary is the only boundary: it confirms the module imports nothing outside the
granted capability set. A module that tries to import env.system is rejected not
because "system" is on a blocklist, but because it is not on the allowlist of
things the host offers.
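To make the capability boundary concrete, here is a minimal JavaScript sketch (the language of the demo runner) rather than the actual Lattice host. It hand-assembles a tiny wasm module, grants it exactly one host function, and shows that a module asking for anything else fails at link time, before any of its code runs. The helper names (`buildModule`, `grants`, `instantiate`) are illustrative, and `lattice_ms_log` is assumed here to take no arguments, which is unlikely to match the real signature.

```javascript
// Sketch: the only way out of a wasm module is through imports the host grants.
// buildModule() hand-assembles a tiny module that imports env.<importName> as a
// () -> () function and exports tick(), whose body just calls that import.
const name = (s) => [s.length, ...[...s].map((c) => c.charCodeAt(0))];
const section = (id, body) => [id, body.length, ...body]; // bodies here are < 128 bytes

function buildModule(importName) {
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // "\0asm", version 1
    ...section(1, [1, 0x60, 0, 0]),                                   // type 0: () -> ()
    ...section(2, [1, ...name("env"), ...name(importName), 0x00, 0]), // function import, type 0
    ...section(3, [1, 0]),                                            // one local function, type 0
    ...section(7, [1, ...name("tick"), 0x00, 1]),                     // export function #1 as "tick"
    ...section(10, [1, 4, 0, 0x10, 0, 0x0b]),                         // body: call 0; end
  ]);
}

// The complete capability set in this sketch: one host function, nothing else.
let logCalls = 0;
const grants = { env: { lattice_ms_log: () => { logCalls += 1; } } };

function instantiate(bytes) {
  return new WebAssembly.Instance(new WebAssembly.Module(bytes), grants);
}

// A module that stays inside the granted set links and runs.
instantiate(buildModule("lattice_ms_log")).exports.tick();

// A module that asks for anything else cannot even be linked.
let linkError = null;
try {
  instantiate(buildModule("system"));
} catch (e) {
  linkError = e;
}
console.log(logCalls, linkError instanceof WebAssembly.LinkError);
```

Nothing in this sketch inspects what the module's code does; the rejection follows purely from what the host chose to offer.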
3. Write-once / dual-target: the headline model
One game-sim source file compiles, with no code change, into both:
| Target | How | lattice_* calls become | Tier |
|---|---|---|---|
| Native | clang/g++ + link liblattice | linked C calls into the real core | self-host / P2P (trusted; developer's own fleet) |
| WASM | clang --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--allow-undefined | undefined imports env.lattice_ms_* | managed hosting (untrusted; our fleet, sandboxed) |
The module is authored against one narrow C ABI surface
(host/lattice-wasm-host/include/lattice_managed.h, the managed-sim surface — a
WASM-crossable projection of the full §16/§17/§18 + C16 ABI: register a replicated
type, spawn authoritative objects, mutate+replicate fields, send/receive
RPCs+events, issue a mediated fetch). The exact same calls:
- Native: resolve at link time to the shim in native_host/lattice_ms_native.cpp, which drives a live lattice_runner (lattice_register_type / lattice_spawn / lattice_object_state + lattice_object_mark_dirty / lattice_rpc / lattice_send_event / lattice_http_request). The native build genuinely runs on the reference core.
- WASM: are left undefined, and the linker emits them as env.lattice_ms_* function imports. The managed host is the only thing that satisfies them, so the sandboxed module reaches the core and nothing else.
This is why managed hosting and self-hosting are not two codebases. They are one source, two link/compile backends, one C-ABI capability surface.
Why a narrow subset and not the full lattice.h? The full ABI returns raw host pointers (lattice_object_state), takes struct-of-pointers descriptors, and pushes host→module callbacks through C function pointers. None of these cross a wasm32 sandbox boundary unchanged (a guest has its own memory and cannot share a host pointer). The managed surface is reshaped to the lowest common denominator that is identical on both targets: state lives in the module's memory; everything crosses as scalars or as a (ptr,len) byte blob the host copies immediately (exactly the shape lattice_rpc / lattice_send_event / lattice_http_request already use); host→module delivery is pulled by the module each tick (lattice_ms_poll_event) rather than pushed through a callback pointer. It is a faithful projection, not a different model.
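The two crossing shapes described above can be sketched from the host side in JavaScript. This is a model under stated assumptions, not the contents of runner/host.mjs: the argument lists and return conventions of `lattice_ms_send_event` and `lattice_ms_poll_event` shown here are hypothetical, and a standalone `WebAssembly.Memory` stands in for the guest's linear memory.

```javascript
// Sketch of the two crossing shapes: (ptr,len) blobs copied on the way in, and
// host→module events pulled by the module. Signatures here are hypothetical.
const memory = new WebAssembly.Memory({ initial: 1 }); // stand-in for guest memory (64 KiB)
const outbound = []; // events the module sent, held as host-owned copies
const inbox = [];    // events waiting for the module to poll (one Uint8Array each)

// Bounds-check a guest (ptr,len) pair and return a view over guest memory.
function guestView(ptr, len) {
  ptr >>>= 0; len >>>= 0; // wasm i32 values reach JS signed; reinterpret as unsigned
  if (ptr + len > memory.buffer.byteLength) throw new RangeError("blob outside guest memory");
  return new Uint8Array(memory.buffer, ptr, len);
}

// Module → host: copy immediately, so later guest writes cannot change the payload.
function lattice_ms_send_event(ptr, len) {
  outbound.push(guestView(ptr, len).slice());
  return 0;
}

// Host → module: the module asks each tick; nothing is pushed through a callback.
// Returns bytes written, 0 if nothing is pending, or -needed if `cap` is too small.
function lattice_ms_poll_event(outPtr, cap) {
  if (inbox.length === 0) return 0;
  const next = inbox[0];
  if (next.length > (cap >>> 0)) return -next.length;
  guestView(outPtr, next.length).set(next);
  inbox.shift();
  return next.length;
}

// Usage, playing the guest's part by writing into `memory` directly.
const guest = new Uint8Array(memory.buffer);
guest.set([1, 2, 3], 16);
lattice_ms_send_event(16, 3);
guest.fill(0, 16, 19);            // the guest reuses its buffer afterwards
inbox.push(Uint8Array.of(9, 8));
const written = lattice_ms_poll_event(32, 8);
console.log(Array.from(outbound[0]), written, Array.from(guest.subarray(32, 34)));
```

Because the host never keeps a view into guest memory past the call, the same contract can be honoured by a native shim that simply reads from a C pointer.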
3.1 Three roles × two packagings
The §18 role model (SERVER / HOST / CLIENT) is orthogonal to packaging.
Role is injected at runtime (lattice_runner_start(mode)); packaging is chosen at
build/deploy time. A managed-hosted authority is SERVER-role and WASM-
packaged; a P2P host is HOST-role and native-packaged; both run the same
sim source.
4. Architecture
                     authored once (C++), against lattice_managed.h
                                            │
                      ┌─────────────────────┴──────────────────────┐
                      │ (a) NATIVE                                 │ (b) WASM
                      ▼                                            ▼
    game_sim.cpp + lattice_ms_native.cpp          game_sim.cpp --target=wasm32
    ── link liblattice ──► game_sim_native        ── no entry, imports undefined ──►
                                                      game_sim.wasm
                      │                                            │
             real lattice_runner                       uploaded to control-plane
              (self-host / P2P)                                    │
                                                         ┌─────────┴─────────┐
                                                         │ import-allowlist  │  REJECT if any
                                                         │ admission SCAN    │  import ∉ allowlist
                                                         └─────────┬─────────┘  / oversized
                                                                   │ admit
                                                         ┌─────────┴─────────┐
                                                         │ managed host:     │  grants ONLY
                                                         │ WASM sandbox +    │  env.lattice_ms_*;
                                                         │ lattice_ms_*      │  drives init/tick;
                                                         │ host fns over     │  mediated egress
                                                         │ liblattice        │  gated by allow-list
                                                         └───────────────────┘
5. The capability / egress model
The module's entire set of capabilities is the list of host functions it is
granted — nothing more. They are enumerated in lattice_managed.h and the
allowlist (runner/allowlist.mjs, Modules/WasmImportScanner.cs):
lattice_ms_log, type registration, spawn/despawn, field get/set, object
enumeration, lattice_ms_rpc, lattice_ms_send_event, lattice_ms_http_request,
lattice_ms_poll_event.
Egress is the sharp edge. The one capability that can touch the outside world
is the mediated web fetch (lattice_ms_http_request → C16 lattice_http_request).
It is mediated and host-controlled:
- the egress allow-list (host:port entries) is configured by the host, not the module. The module cannot widen its own egress.
- empty allow-list ⇒ deny all (egress is opt-in).
- a request to a non-allow-listed host is rejected immediately with a negative reason (LATTICE_HTTP_NOT_ALLOWED); an allow-listed request is admitted, then rate-limited (token bucket). This is the SSRF / exfiltration defence: the server cannot be turned into an arbitrary outbound proxy.
The module learns the outcome (it gets a result), but it cannot reach anything the host did not pre-authorize. Verified in §9 T5.
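A minimal sketch of such a gate, assuming a simple continuous-refill token bucket, looks like this. The numeric result codes, the `LATTICE_HTTP_RATE_LIMITED` name, the `EgressGate` class, and the bucket parameters are all hypothetical; the source only states that the not-allowed reason is negative and that admitted requests are rate-limited.

```javascript
// Sketch of a host-side egress gate: allow-list first, then a token bucket.
// Result codes are hypothetical stand-ins; the source only says the reason is negative.
const LATTICE_HTTP_OK = 0;
const LATTICE_HTTP_NOT_ALLOWED = -1;
const LATTICE_HTTP_RATE_LIMITED = -2; // hypothetical name

class EgressGate {
  // allowList: iterable of "host:port" strings, supplied by the host only.
  // now: injectable millisecond clock, so the bucket can be exercised deterministically.
  constructor(allowList, { capacity, refillPerSec }, now = () => Date.now()) {
    this.allow = new Set(allowList);
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;
    this.last = now();
  }

  check(url) {
    let key;
    try {
      const u = new URL(url);
      if (u.protocol !== "http:" && u.protocol !== "https:") return LATTICE_HTTP_NOT_ALLOWED;
      const port = u.port || (u.protocol === "https:" ? "443" : "80");
      key = `${u.hostname}:${port}`;
    } catch {
      return LATTICE_HTTP_NOT_ALLOWED; // unparseable target: deny
    }
    if (!this.allow.has(key)) return LATTICE_HTTP_NOT_ALLOWED; // an empty set denies everything

    // Refill in proportion to elapsed time, capped at capacity, then spend one token.
    const t = this.now();
    this.tokens = Math.min(this.capacity, this.tokens + ((t - this.last) / 1000) * this.refillPerSec);
    this.last = t;
    if (this.tokens < 1) return LATTICE_HTTP_RATE_LIMITED;
    this.tokens -= 1;
    return LATTICE_HTTP_OK;
  }
}

let clock = 0;
const gate = new EgressGate(["127.0.0.1:9"], { capacity: 2, refillPerSec: 1 }, () => clock);
const results = [
  gate.check("http://127.0.0.1:9/score"), // allowed, one token left
  gate.check("http://127.0.0.1:9/score"), // allowed, bucket now empty
  gate.check("http://127.0.0.1:9/score"), // rate-limited
  gate.check("http://example.com/"),      // not on the allow-list
];
clock = 1000;                             // one second later: one token refilled
results.push(gate.check("http://127.0.0.1:9/score"));
console.log(results);
```

The module never sees the allow-list or the bucket; it only receives the result code, which is what keeps egress policy entirely on the host side.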
6. Import-allowlist admission (the real safe-binary check)
A wasm module affects the world only through (1) its imports and (2) the host's wiring of its exports — and the host controls (2) entirely. So the admission gate is purely about (1):
Enumerate the module's import section. Reject if any import is outside the allowlist: module must be env; a function import's name must be a known lattice_ms_* host function; tables / globals / a second memory are forbidden. Enforce a size limit and require the host-called exports.
This is implemented twice, identically in spirit:
- Runtime / demo host: runner/allowlist.mjs uses WebAssembly.Module.imports() (compilation reads the import section without executing the module).
- Control plane (upload time): Modules/WasmImportScanner.cs hand-parses the wasm preamble + import section (LEB128 walk), with no wasm engine dependency on the server and no instantiation. Verified to agree with the JS scanner on real clang output (§9 T6).
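The engine-assisted variant of the check can be sketched as follows. This is an illustration of the rule, not the contents of runner/allowlist.mjs: the allowlist shown is a subset, the export names `init` and `tick`, the 1 MiB limit, and the `scanModule` / `buildModule` helpers are assumptions, and the sketch rejects every non-function import, which is stricter than the rule as stated.

```javascript
// Sketch of an engine-assisted admission scan. Limits and export names are hypothetical.
const ALLOWED_IMPORTS = new Set([
  "lattice_ms_log", "lattice_ms_rpc", "lattice_ms_send_event",
  "lattice_ms_http_request", "lattice_ms_poll_event",
]); // illustrative subset of the granted host functions
const REQUIRED_EXPORTS = ["init", "tick"]; // assumed names for the host-called exports
const MAX_MODULE_BYTES = 1 << 20;          // assumed size limit (1 MiB)

function scanModule(bytes) {
  if (bytes.length > MAX_MODULE_BYTES) {
    return { admitted: false, imports: [], reasons: [`module larger than ${MAX_MODULE_BYTES} bytes`] };
  }
  let mod;
  try {
    mod = new WebAssembly.Module(bytes); // compiles and validates; no guest code runs
  } catch (e) {
    return { admitted: false, imports: [], reasons: [`not a valid wasm module: ${e.message}`] };
  }
  const reasons = [];
  const imports = WebAssembly.Module.imports(mod);
  for (const imp of imports) {
    const id = `${imp.module}.${imp.name}`;
    if (imp.kind !== "function") {
      reasons.push(`forbidden ${imp.kind} import: ${id}`); // simplification: all non-function imports
    } else if (imp.module !== "env" || !ALLOWED_IMPORTS.has(imp.name)) {
      reasons.push(`import not in allowlist: ${id}`);
    }
  }
  const exported = new Set(
    WebAssembly.Module.exports(mod).filter((e) => e.kind === "function").map((e) => e.name));
  for (const required of REQUIRED_EXPORTS) {
    if (!exported.has(required)) reasons.push(`missing required export: ${required}`);
  }
  return { admitted: reasons.length === 0, imports: imports.map((i) => `${i.module}.${i.name}`), reasons };
}

// Tiny hand-assembled modules for the demo: each imports env.<name> as () -> ()
// and exports one empty function under both "init" and "tick".
const str = (s) => [s.length, ...[...s].map((c) => c.charCodeAt(0))];
const section = (id, body) => [id, body.length, ...body]; // bodies here are < 128 bytes
function buildModule(importNames) {
  const localFn = importNames.length; // imports occupy function indices 0..n-1
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
    ...section(1, [1, 0x60, 0, 0]),
    ...section(2, [importNames.length, ...importNames.flatMap((n) => [...str("env"), ...str(n), 0x00, 0])]),
    ...section(3, [1, 0]),
    ...section(7, [2, ...str("init"), 0x00, localFn, ...str("tick"), 0x00, localFn]),
    ...section(10, [1, 2, 0, 0x0b]),
  ]);
}

const good = scanModule(buildModule(["lattice_ms_log"]));
const bad = scanModule(buildModule(["system", "fd_write"]));
console.log(good.admitted, bad.admitted, bad.reasons);
```

Note that the rejection reasons name the offending imports, and that `system` fails only because it is absent from the allowlist, not because it appears on any blocklist.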
The production C++/wasmtime host enforces the same set a third time via its linker
(it simply does not define any import outside the allowlist). The check is a
property of the .wasm itself, so it is identical regardless of runtime.
A tampered module importing env.system / env.fd_write is rejected with the
offending imports named; it is never stored or made schedulable.
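Because the check is a property of the binary, it can also be done with no engine at all. The following sketch shows the general shape of such a walk (preamble, section headers, LEB128 lengths, import entries), written in JavaScript for consistency with the other examples rather than in C#. It is not the WasmImportScanner.cs implementation: the function names are illustrative, and it handles only the four original import kinds (function, table, memory, global) with 32-bit limits.

```javascript
// Sketch of an engine-free import scan: walk the wasm binary by hand.
// Covers the original four import kinds only; it does not validate the rest of the module.
const KINDS = ["function", "table", "memory", "global"];

// Decode one unsigned LEB128 u32 at `pos`; returns [value, nextPos].
function readU32(bytes, pos) {
  let value = 0;
  for (let shift = 0; shift <= 28; shift += 7) {
    if (pos >= bytes.length) throw new Error("truncated LEB128");
    const b = bytes[pos++];
    value += (b & 0x7f) * 2 ** shift; // multiply instead of <<, which is signed 32-bit in JS
    if ((b & 0x80) === 0) return [value, pos];
  }
  throw new Error("LEB128 longer than 5 bytes");
}

// Read a length-prefixed UTF-8 name; returns [string, nextPos].
function readName(bytes, pos) {
  const [len, start] = readU32(bytes, pos);
  if (start + len > bytes.length) throw new Error("truncated name");
  return [new TextDecoder().decode(bytes.subarray(start, start + len)), start + len];
}

// Skip a limits record: flags byte, min, and max when flag bit 0 is set.
function skipLimits(bytes, pos) {
  const flags = bytes[pos++];
  [, pos] = readU32(bytes, pos);
  if (flags & 1) [, pos] = readU32(bytes, pos);
  return pos;
}

function parseImports(bytes) {
  const preamble = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]; // "\0asm" + version 1
  if (bytes.length < 8 || preamble.some((b, i) => bytes[i] !== b)) throw new Error("bad wasm preamble");
  let pos = 8;
  while (pos < bytes.length) {
    const id = bytes[pos++];
    let size;
    [size, pos] = readU32(bytes, pos);
    const end = pos + size;
    if (end > bytes.length) throw new Error("truncated section");
    if (id !== 2) { pos = end; continue; } // not the import section: skip it whole
    const imports = [];
    let count;
    [count, pos] = readU32(bytes, pos);
    for (let i = 0; i < count; i++) {
      let modName, field;
      [modName, pos] = readName(bytes, pos);
      [field, pos] = readName(bytes, pos);
      const kind = bytes[pos++];
      if (kind === 0) [, pos] = readU32(bytes, pos);         // type index
      else if (kind === 1) pos = skipLimits(bytes, pos + 1); // reftype byte, then limits
      else if (kind === 2) pos = skipLimits(bytes, pos);     // limits
      else if (kind === 3) pos += 2;                         // valtype byte, mutability byte
      else throw new Error(`unsupported import kind ${kind}`);
      imports.push({ module: modName, name: field, kind: KINDS[kind] });
    }
    if (pos !== end) throw new Error("import section size mismatch");
    return imports;
  }
  return []; // no import section present
}

// Demo input: a hand-assembled module with one type and two function imports.
const str = (s) => [s.length, ...[...s].map((c) => c.charCodeAt(0))];
const importBody = [2, ...str("env"), ...str("lattice_ms_log"), 0x00, 0,
                       ...str("env"), ...str("system"), 0x00, 0];
const sample = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 4, 1, 0x60, 0, 0,                 // type section: one () -> () signature
  0x02, importBody.length, ...importBody, // import section
]);
const found = parseImports(sample).map((i) => `${i.module}.${i.name}`);
console.log(found);
```

A useful sanity check for any hand-written walker of this kind is to compare its output against an engine's own import enumeration on the same bytes.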
7. Platform flow: upload → scan → register → schedule
control-plane/lattice-console:
- Upload (POST /games/{gameId}/modules, dev-scoped by RBAC: only the game owner / super-admin). Body: { packaging: "wasm"|"native", version, wasmBase64 }.
- Scan / record:
  - wasm → WasmImportScanner.Scan. Admitted ⇒ stored with the seen imports, SHA-256, size, and schedulable = true. Rejected ⇒ 400 with the reasons, and nothing is stored.
  - native → recorded for the self-host/P2P tier with no sandboxing claim (import-scanning cannot isolate native code; see §8).
- Register: a ModuleBuild (BuildId, GameId, packaging, version, hash, Admitted, Imports, Schedulable).
- Schedule: the director places a Schedulable build under its packaging tier. (Director wiring is out of scope here; the build record is the contract.)
8. Native microVM: the noted alternative tier
WASM is the default isolation for untrusted code because it is in-process, fast to start, and cheap to densely multi-tenant. But some modules need native performance, threads, or libraries the wasm subset doesn't offer. The alternative isolation tier for untrusted native modules is a microVM (Firecracker-style) or a hardened container/gVisor: one tenant per VM, a virtio/vsock-mediated syscall surface, and the same egress allow-list at the VM's network boundary.
This is explicitly a different boundary from the import-scan. We do not claim a native binary is safe because we scanned its ELF/PE — that scan is defence-in-depth only. Untrusted native = microVM; trusted native (the developer's own fleet) = self-host packaging with no isolation needed.
9. What is verified HERE vs deferred to production
Verified here (runnable):
- clang can target wasm32 in this environment (after apt-get install clang lld), producing a reactor module whose imports are exactly the host functions.
- Write-once / dual-target: one sample/game_sim.cpp compiles to both game_sim_native (liblattice-linked) and game_sim.wasm, and both produce the identical replicated field values and a bit-identical simulation checksum:
=== NATIVE ===
objects=4
obj[0] netid=1 x=8 y=12 score=80
obj[1] netid=2 x=26 y=12 score=80
obj[2] netid=3 x=44 y=12 score=80
obj[3] netid=4 x=62 y=12 score=80
CHECKSUM=17699989211919191625
INBOUND rpc=0 event=0 http=1
=== WASM ===
objects=4
obj[0] netid=1 x=8 y=12 score=80
obj[1] netid=2 x=26 y=12 score=80
obj[2] netid=3 x=44 y=12 score=80
obj[3] netid=4 x=62 y=12 score=80
CHECKSUM=17699989211919191625
INBOUND rpc=0 event=0 http=1
(The equality checksum covers the module-driven simulation, which must be identical; inbound-message arrival timing rides the host transport and is reported separately — both targets deliver the same mediated-fetch result.)
- Import-allowlist admission (T1–T3, runtime): the legit sample is admitted; a tampered module importing env.system / env.fd_write is rejected with the offending imports named; an oversized module is rejected.
- Egress is host-controlled (T5): with an empty allow-list the fetch is denied immediately; the module cannot widen its own egress; the simulation is identical regardless.
- Control-plane scan gate: dotnet test is green (45 tests; +8 new), incl. a good wasm admitted + a tampered wasm rejected end-to-end through the upload endpoint, with RBAC scoping. The C# scanner agrees with the JS scanner on the real built game_sim.wasm / tampered_module.wasm.
Deferred to production (honest gaps):
- The runtime here is Node's WebAssembly, because wasmtime is not installed in this environment. The lattice_ms_* host functions in runner/host.mjs are a JS model of the C-ABI host functions. Production is a C++/wasmtime host embedding liblattice directly, whose imports call straight into the core (the native_host/ shim already is that bridge for the native-link tier). The Node host reproduces the same observable behaviour so the write-once proof holds.
- No microVM here: the native-isolation tier (§8) is designed, not built.
- The web fetch is offline in this sandbox: the sample targets an allow-listed 127.0.0.1:9 that refuses fast, yielding a deterministic CONNECT_FAIL so the mediated path is exercised end-to-end without a live external server. Production issues the real request on a worker thread (C16), still egress-gated.
- Native ELF/PE inspection is defence-in-depth only and is not the isolation boundary for untrusted native code.
10. Files
- host/lattice-wasm-host/include/lattice_managed.h: the dual-target managed-sim C ABI (the capability surface).
- host/lattice-wasm-host/sample/game_sim.cpp: the one write-once sim source.
- host/lattice-wasm-host/sample/tampered_module.c: the malicious module the scan must reject.
- host/lattice-wasm-host/native_host/lattice_ms_native.cpp: native shim (lattice_ms_* over liblattice) + native harness.
- host/lattice-wasm-host/runner/allowlist.mjs: import-allowlist + limits scan.
- host/lattice-wasm-host/runner/host.mjs: the Node managed host (sandbox runner).
- host/lattice-wasm-host/test/run-tests.mjs: the runtime test suite (T1–T5).
- host/lattice-wasm-host/build.sh: dual-compile (native + wasm + tampered).
- control-plane/lattice-console/src/.../Modules/WasmImportScanner.cs: server-side scan.
- control-plane/lattice-console/src/.../Services/ModuleBuildService.cs, Domain/ModuleBuild.cs, endpoints in Program.cs: upload → scan → register.
- control-plane/lattice-console/tests/.../ModuleScanGateTests.cs: the scan gate tests.