
COW Fork: Zero-Copy Sandbox Cloning for AI Agents

March 19, 2026 by Cong Wang, Founder and CEO

announcement open-source linux-kernel ai-infrastructure

Every AI sandbox today wastes the same resources the same way.

An RL training loop loads a 2 GB reward model, imports PyTorch, preprocesses a dataset. This takes five seconds. Then it evaluates 10,000 candidate programs, each in its own sandbox. With containers, each sandbox re-initializes from scratch: five seconds of setup for one second of work. The math is brutal: 10,000 sandboxes times five seconds of initialization is 14 hours of wasted compute, just loading the same model into the same framework ten thousand times.

The data tells the same story across every AI workload. Code evaluation benchmarks spend 80% of wall time on sandbox startup. Agent tool-calling loops pay a cold-start penalty on every invocation. Hyperparameter sweeps re-initialize identical training setups thousands of times. The sandbox is the bottleneck, and the bottleneck is initialization.

Today we are releasing COW fork for Sandlock. Initialize a sandbox once. Fork it ten thousand times. Each fork takes under 700 microseconds and shares every memory page with the original. Over 1,300 forks per second on a single core. To our knowledge, this is the first AI sandbox to provide process-level copy-on-write forking as a first-class API.

What It Looks Like

import os

from sandlock import Sandbox, Policy

def init():
    global model, dataset
    model = load_model("reward_model.pt")     # 2 GB, loaded once
    dataset = load_dataset("eval_set.pt")     # 500 MB, loaded once

def work():
    seed = int(os.environ["SEED"])
    result = evaluate(model, dataset, seed)   # reads COW-shared globals
    save_result(result)

policy = Policy(
    fs_readable=["/usr", "/lib", "/etc"],
    fs_writable=["/tmp"],
    max_memory="256M",
    max_processes=5,
)

with Sandbox(policy, init, work) as sb:
    for seed in range(10_000):
        sb.fork(env={"SEED": str(seed)}).wait()

Three functions. init() runs once, loads the model, prepares the data. work() runs in each clone, reads the shared state, produces a result. sb.fork() creates a new clone in under 700 microseconds. Ten thousand clones share 2.5 GB of model and dataset memory. Total memory for the model across all clones: 2 GB. Not 20 TB.

Why This Was Not Possible Before

Every existing sandbox technology has the same structural limitation: each sandbox gets its own memory space, initialized from scratch.

Containers isolate processes via kernel namespaces (mount, PID, network, user). This provides strong boundaries, but it also breaks the page table sharing that makes copy-on-write work. A process inside a container lives in a different virtual address space than the host. There is no way to fork() a container from the outside and inherit its in-memory state. To “clone” a container, you must either snapshot the filesystem and cold-start a new one (losing all in-memory state), or use CRIU to checkpoint and restore the full process state (approximately 100,000 lines of code, requires root and kernel patches, adds hundreds of milliseconds per cycle).

MicroVMs (Firecracker, QEMU) run a separate guest kernel. Each VM has its own physical memory region. Cloning a VM means snapshotting guest memory and creating a new VM from the snapshot. This is faster than container cold-start but still measured in hundreds of milliseconds, and requires KVM and root access.

gVisor intercepts every syscall through a user-space kernel reimplementation. Each sandbox runs in its own Sentry process with its own address space. No memory sharing between sandboxes.

The common thread: all these approaches create isolation by placing the sandboxed process in a separate address space. This is exactly what prevents COW page sharing. Isolation and sharing are in tension, and every existing design chose isolation at the cost of sharing.

Sandlock resolves this tension by using a different isolation mechanism entirely.

How It Works

Sandlock confines processes using the kernel’s own security primitives: Landlock for filesystem and network access control, seccomp-bpf for syscall filtering, and seccomp user notification for resource limits. These mechanisms operate within the process’s existing address space. They do not create new namespaces and they do not break page table sharing.

This means fork() works exactly as the kernel designed it: the child process gets a copy-on-write view of the parent’s entire address space. Model weights, dataset buffers, Python interpreter state, imported modules, JIT caches. All shared at the physical page level. All isolated by Landlock, seccomp, and process group boundaries.
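The page-sharing semantics Sandlock relies on can be observed with plain os.fork(), no Sandlock required. A minimal sketch (note that CPython's reference counting dirties some shared pages even on reads, so sharing in pure Python is imperfect; the isolation semantics still hold):

```python
import os

# Template builds a large structure once; stands in for model weights.
data = list(range(1_000_000))

pid = os.fork()
if pid == 0:
    # Clone: reads the inherited pages without copying them.
    total = sum(data[:10])
    # A write copies only the touched page (COW) and is
    # invisible to the template.
    data[0] = -1
    os._exit(0 if total == 45 else 1)

_, status = os.waitpid(pid, 0)
assert os.waitstatus_to_exitcode(status) == 0
# The clone's write went to a private copy of that page.
print("template data intact:", data[0])
```

The clone sees the fully initialized structure for free, and its writes never propagate back, which is exactly the property the template/clone design above depends on.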

The implementation has no exotic dependencies:

Template process (main thread):
    init()                           # user's setup, runs once
    while True:
        cmd = os.read(control_fd, 1) # blocks, GIL released
        if cmd == TRIGGER_FORK:
            env = read_env()
            pid = os.fork()          # kernel creates COW clone
            if pid == 0:
                os.close(control_fd) # clone cannot command the template
                setpgid(0, 0)        # own process group
                os.environ.update(env)
                work()               # user's work function
                os._exit(0)
            else:
                send_pid(pid)        # tell parent about new clone

After init() returns, the main thread enters a fork-ready loop. It blocks on os.read(), which releases the GIL. No CPU is consumed while waiting. When the parent sends a fork command, the main thread calls the raw fork(2) syscall, which bypasses the seccomp notification path entirely for minimal kernel overhead. No signals. No ptrace. No machine code injection. No register manipulation. The main thread forks itself from inside a simple event loop.

Each clone inherits the template’s Landlock ruleset and seccomp filter. These are kernel-level restrictions that survive fork() and cannot be removed by the child. The clone is confined from its first instruction.
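Demonstrating Landlock inheritance needs the library itself, but the closely related no_new_privs bit, a prerequisite for unprivileged Landlock and seccomp, exhibits the same one-way, fork-inherited behavior. A Linux-only sketch calling prctl(2) through ctypes:

```python
import ctypes
import os

libc = ctypes.CDLL(None, use_errno=True)
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

# Set the bit in the template; it cannot be cleared afterwards.
libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)

pid = os.fork()
if pid == 0:
    # Clone: read the bit back; 1 means it survived fork().
    flag = libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
    os._exit(flag)

_, status = os.waitpid(pid, 0)
inherited = os.waitstatus_to_exitcode(status) == 1
print("no_new_privs inherited:", inherited)
```

Landlock rulesets and seccomp filters follow the same rule: restrictions applied to the template are carried into every clone and cannot be shed.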

The Numbers

|                               | Sandlock fork()         | Container restart | MicroVM snapshot     |
|-------------------------------|-------------------------|-------------------|----------------------|
| Clone latency                 | ~680 us                 | ~200 ms           | ~150 ms              |
| Forks per second              | ~1,300                  | ~5                | ~7                   |
| Memory per clone (2 GB model) | ~4 KB (page tables)     | 2 GB (full copy)  | 2 GB (guest RAM)     |
| 10,000 clones total memory    | ~2 GB                   | ~20 TB            | ~20 TB               |
| Root required                 | No                      | Yes (CRIU)        | Yes (KVM)            |
| State preserved               | Full (heap, stack, fds) | Filesystem only   | Full (with snapshot) |
| Lines of infrastructure code  | 0 (kernel fork())       | ~100K (CRIU)      | ~50K (Firecracker)   |

680 microseconds per fork, measured end to end (parent sends command, child forks, parent receives clone PID). The fork itself uses the raw fork(2) syscall, which bypasses the seccomp notification path for near-zero kernel overhead. Over 1,300 forks per second sustained on a single core.
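The exact figure varies by machine, but raw fork throughput is easy to sanity-check. A rough microbenchmark of a full fork + exit + waitpid cycle (this measures more than fork(2) alone, so it is an upper bound on fork latency):

```python
import os
import time

N = 200
start = time.perf_counter()
for _ in range(N):
    pid = os.fork()
    if pid == 0:
        os._exit(0)      # child does no work, exits immediately
    os.waitpid(pid, 0)   # reap before the next cycle
per_fork_us = (time.perf_counter() - start) * 1e6 / N

print(f"fork+exit+wait: {per_fork_us:.0f} us per cycle")
```

Sandlock's ~680 us end-to-end figure additionally includes the control-channel round trip, which is why it sits above what a bare fork loop measures on most hardware.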

The per-clone memory overhead is the cost of a new set of page table entries, roughly 4 KB. The shared pages remain shared until written. For a read-heavy workload like model inference, most pages are never written, so the sharing persists for the clone’s entire lifetime.

Correctness Guarantees

COW fork is not a shortcut that trades safety for speed. Each clone provides the same isolation guarantees as a standalone sandbox:

Memory isolation. fork() creates a private address space. Writes in a clone do not affect the template or other clones. The kernel enforces this at the hardware level through page table permissions.

Confinement inheritance. Landlock rulesets and seccomp filters are inherited across fork() and cannot be removed. A clone cannot grant itself permissions that the template does not have.

Process group isolation. Each clone creates its own process group via setpgid(0, 0). Signals (SIGSTOP, SIGKILL) can target individual clones without affecting the template or other clones.

Environment isolation. Each clone receives its own environment overrides. The template’s environment is never modified because os.environ.update() triggers COW on the affected pages.

File descriptor isolation. The clone closes the control socket immediately after fork. It cannot send commands to the template or create additional clones.
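Several of these guarantees can be exercised with plain os.fork(). A sketch showing that a clone's writes to globals and to its environment, plus its setpgid(0, 0) call, leave the template untouched:

```python
import os

os.environ["SEED"] = "template"
state = {"counter": 0}

pid = os.fork()
if pid == 0:
    os.setpgid(0, 0)                   # clone gets its own process group
    os.environ["SEED"] = "42"          # COW: private to the clone
    state["counter"] = 99              # likewise private
    ok = os.getpgid(0) == os.getpid()  # pgid now differs from the template's
    os._exit(0 if ok else 1)

_, status = os.waitpid(pid, 0)
assert os.waitstatus_to_exitcode(status) == 0
# Template's environment and state are untouched by the clone.
print(os.environ["SEED"], state["counter"])
```

Every mutation the clone makes lands on its own copy-on-write pages, so the template can keep forking from pristine state indefinitely.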

Use Cases

RL rollouts. Load a reward model once, fork 10,000 clones with different random seeds. Each clone evaluates a candidate solution against the model and dataset. The model exists once in physical memory.

AI agent tool execution. An agent loads a large context window, knowledge base, and tool registry. Each tool call runs in a forked clone that inherits the full agent state via COW. The clone executes the tool in isolation and returns the result. No re-initialization between calls.

Code evaluation at scale. A benchmark harness loads test cases and reference implementations. Each candidate solution runs in a forked clone with memory caps and process limits. Crashes, infinite loops, and memory leaks are contained. The harness continues without interruption.

Hyperparameter search. A training setup function initializes the model architecture, data loaders, and optimizer state. Each hyperparameter configuration runs in a forked clone, starting from the exact same initialized state. No variation from re-initialization.

Getting Started

COW fork is available in Sandlock today:

pip install git+https://github.com/multikernel/sandlock.git

import os

from sandlock import Sandbox, Policy

def init():
    global model
    model = load_model()

def work():
    seed = int(os.environ["SEED"])
    rollout(model, seed)

with Sandbox(Policy(fs_readable=["/usr","/lib","/etc"], fs_writable=["/tmp"]), init, work) as sb:
    for seed in range(1000):
        sb.fork(env={"SEED": str(seed)}).wait()

Sandlock requires Linux 5.13+ and Python 3.10+. No root, no cgroups, no container runtime, no CRIU. The project is open source under Apache 2.0.

We welcome contributions, bug reports, and feedback on GitHub.