PandaStack vs Fly.io Sprites: Firecracker Sandboxes

Ajay Kumar · June 17, 2026

PandaStack and Fly.io Sprites are architectural cousins. Both run your code inside Firecracker microVMs — real hardware-virtualized guests, not shared-kernel containers — so on the isolation question that matters most for untrusted or AI-generated code, they're peers. Where they diverge is the bet each one makes on top of that primitive. Sprites are persistent-by-default machines that keep their filesystem indefinitely and scale to zero when idle; PandaStack restores a baked snapshot on every create (179ms p50, no warm pool) and treats forking as a first-class copy-on-write primitive. PandaStack's core is also open-source under Apache-2.0 and built to self-host on your own KVM hosts. This post draws the honest parallel and is careful where the public record on Sprites is thin.

I'm the founder of PandaStack, so treat this as a vendor's comparison. I state specific numbers only for PandaStack. Fly.io Sprites is new (it launched in early 2026), so I describe it in general, qualitative terms rather than inventing latencies, prices, or internals — and I flag where details are reported by press rather than confirmed. Verify anything about Sprites against Fly's own docs and sprites.dev at the time you read this, because a young product's specifics change fast. If you're coming from E2B and searching 'e2b vs fly.io sprites,' the framing here applies: all three are Firecracker; the differences are everything above the hypervisor.

Isolation model: both are real Firecracker microVMs

Start here because it's the dimension people most often get wrong. Both PandaStack and Fly.io Sprites run each sandbox inside a Firecracker microVM. That means every sandbox has its own guest kernel and is isolated by hardware virtualization (KVM), not by namespaces over a shared host kernel. Fly leans hard on exactly this for its agent-safety messaging ('VMs, not containers'), and it's a fair claim. PandaStack makes the same one. A container shares the host kernel, so a kernel-level escape is a host compromise; a microVM exposes a far smaller, far better-audited attack surface (the VMM), so a guest-kernel exploit still has to break out of the VMM before it reaches the host.

So if your evaluation bar is 'is this safe to run arbitrary LLM-written code in,' both clear it. The interesting differences are all downstream of an isolation model the two share. That shared foundation is also why this is a genuinely close comparison rather than a strawman — these are two serious teams making different product bets on the same primitive.

The core split: persistent-by-default vs snapshot-restore-on-create

This is the single most defensible contrast between the two, and it's not FUD — it's two different, legitimate bets. Fly positions Sprites as persistent Linux computers: the filesystem survives indefinitely between sessions, so installed packages, files, and running services stay as the agent left them, and the machine scales to zero when idle (with billing stopping while inactive and state restored on resume). The pitch is 'agents keep their state between sessions' and 'stop paying for idle environments.' That's a real, differentiated design.

PandaStack makes the opposite bet. There is no warm pool of idle VMs and no long-lived per-agent machine by default. Every create restores a baked Firecracker snapshot on demand — a snapshot that already contains a booted kernel, a running guest agent, and an open network stack. 'Starting' a sandbox is really 'restore memory pages and resume,' which lands at 179ms p50 (p99 ~203ms). The only slow path is the very first spawn of a brand-new template, which does a real cold boot (~3s) and bakes the snapshot; every create after that is on the fast restore path.
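
To see how quickly that one-time cold boot stops mattering, here is a back-of-the-envelope sketch. It plugs in the figures quoted above (about 3s for the first spawn, 179ms p50 for each restore) and treats the p50 as a typical restore time, which is a simplification; the function name amortized_create_ms is just for illustration.

```python
def amortized_create_ms(
    creates: int,
    cold_boot_ms: float = 3000.0,  # first spawn of a new template (~3s, from the text)
    restore_ms: float = 179.0,     # p50 restore, used here as a stand-in for "typical"
) -> float:
    """Average create latency when one cold boot is followed by restores."""
    if creates < 1:
        raise ValueError("creates must be at least 1")
    return (cold_boot_ms + (creates - 1) * restore_ms) / creates


for n in (1, 10, 100, 1000):
    print(f"{n:>5} creates -> {amortized_create_ms(n):.1f} ms average")
# 1 -> 3000.0, 10 -> 461.1, 100 -> 207.2, 1000 -> 181.8
```

By the hundredth create from a template, the cold boot adds under 30ms to the average; the restore path is what you should actually benchmark.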

Which bet fits depends on your workload shape. If an agent is a long-running entity that mutates one environment over hours or days and you want that exact state to persist cheaply when idle, the persistent-machine model is a clean match. If you spin up a fresh, identical environment per task (per request, per test, per rollout) and want each one to start in well under a quarter second from a known-good baked state, snapshot-restore-on-create is the model built for that. PandaStack does offer persistence where you need it (durable volumes, and a scale-to-zero auto-hibernate path for hosted apps), but the default, optimized path is fast restore, not a persistent box.

Forking and copy-on-write state

Snapshotting is where the microVM model pays off in ways containers can't match, and it's where PandaStack invests hardest. PandaStack exposes both full snapshots and forks as first-class primitives. A snapshot captures the full machine state — memory plus rootfs. A fork clones a running sandbox using copy-on-write: guest memory is shared via MAP_PRIVATE (the kernel only copies pages on write) and the rootfs is cloned with an XFS reflink, so data is shared until something writes to it.
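
To make the copy-on-write semantics concrete, here is a small Linux/Unix-only sketch using Python's standard mmap module. It is not PandaStack's fork code; it only demonstrates the MAP_PRIVATE behavior the paragraph above relies on. Two private mappings of the same file share pages until one of them writes, and that write reaches neither the other mapping nor the file. The function name cow_demo and the temp file are illustrative.

```python
import mmap
import os
import tempfile


def cow_demo() -> tuple[bytes, bytes, bytes]:
    """Return (branch_a_view, branch_b_view, file_on_disk) after branch A writes."""
    # One page of "base state" standing in for a snapshot's memory file.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"base-state" + b"\0" * (mmap.PAGESIZE - 10))
        path = f.name
    try:
        with open(path, "r+b") as fh:
            prot = mmap.PROT_READ | mmap.PROT_WRITE
            # Two private (copy-on-write) mappings of the same backing file.
            a = mmap.mmap(fh.fileno(), 0, flags=mmap.MAP_PRIVATE, prot=prot)
            b = mmap.mmap(fh.fileno(), 0, flags=mmap.MAP_PRIVATE, prot=prot)
            a[0:4] = b"FORK"  # the kernel copies this page for mapping A only
            view_a, view_b = a[:10], b[:10]
            a.close()
            b.close()
        with open(path, "rb") as fh:
            on_disk = fh.read(10)  # private writes are never written back
    finally:
        os.unlink(path)
    return view_a, view_b, on_disk


print(cow_demo())  # (b'FORK-state', b'base-state', b'base-state')
```

Only the page that branch A touched gets duplicated; everything else stays shared, which is why a fork of a large guest can be cheap when branches write little.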

Concretely, a same-host fork completes in about 400ms; a cross-host fork (which streams the snapshot from object storage and restores) runs 1.2–3.5s. The use case this unlocks: get an environment into a known state once — dependencies installed, a dataset loaded, a REPL warmed — then fork it N times to explore branches in parallel, each starting from the exact same memory state without re-running setup. If your workload is tree-search, agent rollouts, or 'try five fixes and keep the one that passes,' that branching primitive is the thing to evaluate hardest. See /docs/concepts/snapshots-and-forks for the full API.
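
The 'try five fixes and keep the one that passes' pattern can be sketched as follows. The only SDK surface it assumes is what appears elsewhere in this post: a sandbox with fork() and exec(), and a result with stdout. FakeSandbox is an in-memory stand-in so the sketch runs without an account, and first_passing and the PASS/FAIL convention are hypothetical names for illustration, not part of any SDK.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class FakeResult:
    stdout: str


class FakeSandbox:
    """In-memory stand-in for a sandbox; only one command 'passes'."""

    def __init__(self, passing_command: str) -> None:
        self.passing_command = passing_command
        self.forks = 0  # number of fork() calls made on this instance

    def fork(self) -> "FakeSandbox":
        self.forks += 1
        return FakeSandbox(self.passing_command)

    def exec(self, command: str, timeout_seconds: int = 30) -> FakeResult:
        return FakeResult("PASS" if command == self.passing_command else "FAIL")


def first_passing(
    base, candidates: Sequence[str], passed: Callable[[str], bool]
) -> Optional[str]:
    """Fork the warmed base once per candidate, run all branches in parallel,
    and return the first candidate (in input order) whose output passes."""
    branches = [base.fork() for _ in candidates]
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        results = list(
            pool.map(
                lambda pair: pair[0].exec(pair[1], timeout_seconds=60),
                zip(branches, candidates),
            )
        )
    for candidate, result in zip(candidates, results):
        if passed(result.stdout):
            return candidate
    return None


base = FakeSandbox(passing_command="fix-3")
fixes = ["fix-1", "fix-2", "fix-3", "fix-4", "fix-5"]
print(first_passing(base, fixes, lambda out: out.strip() == "PASS"))  # fix-3
```

The base sandbox is never mutated by a candidate, so a failed branch costs nothing to discard; in a real run you would also destroy the losing branches.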

Fly's public material describes fast checkpoint/restore with rollback and a retained checkpoint history for Sprites, which sounds genuinely useful for agent experimentation and recovery. I'm deliberately not going to assert the mechanism behind it or quote a hard latency — the internal implementation and the exact numbers aren't something I can verify, and the reported figures vary. If branch-and-fork is core to your workload, test Sprites' checkpoint/restore semantics directly against the parallel-branching pattern you actually run, rather than taking my characterization (or theirs) on faith.

Open-source and self-hosting

This is the cleanest structural difference, and it's worth stating precisely. The PandaStack core is open-source under Apache-2.0 and is designed to be self-hosted. You run the control-plane API and a per-host agent on your own Linux KVM hosts (anything with /dev/kvm), and your sandboxes execute entirely on your infrastructure. There's a hosted offering too, but self-host is a first-class, supported path — the same binaries, the same agent.

Fly.io Sprites, as of this writing, is a hosted Fly.io cloud service: not open-source and not self-hostable. A Fly developer has publicly said they intend to ship an open-source local version 'relatively soon,' but that is a forward-looking statement, not a shipped artifact — so don't plan around it, and check Fly's own announcements for the current status. If you read elsewhere that Sprites is open-source today, that's the intention being mistaken for the release. The practical takeaway: if data residency, VPC isolation, an auditable execution layer, or running code entirely inside your own perimeter is a hard requirement now, that's a structural point for PandaStack. If you're happy on a hosted platform, it's a non-issue.

Platform breadth: one microVM substrate, one bill

Sandboxes are ephemeral by design (or, in Sprites' case, persistent but still just compute), so the interesting question is what else lives on the same isolation substrate. PandaStack is positioned as a microVM platform rather than only a sandbox, and that breadth is the most honest place to differentiate:

Managed PostgreSQL 16 — each database is its own dedicated Firecracker microVM with a durable volume, pgvector and other extensions, PgBouncer pooling, and connectivity over native postgres:// (via SNI routing) or an HTTP query broker for edge functions.
Git-driven app hosting — connect a repo and PandaStack auto-detects the framework (next/vite/cra/node/static/python), does blue-green deploys, scales to zero via auto-hibernate, and supports GitHub push-to-deploy. A Vercel/Render-style flow on the same microVM substrate.
Serverless functions with cron schedules — code bundles you invoke directly or over HTTP, with scheduled triggers.
Durable volumes — persistent disk for sandboxes that need state beyond the ephemeral copy-on-write rootfs.

The point isn't 'more features = better.' Fly is a mature platform with its own breadth — a global edge network, an established Firecracker operations track record, and a broad app-deploy story that long predates Sprites. If you're already running on Fly's network, that gravity is real and legitimate. PandaStack's argument is narrower and specific: if you're building an AI product that also needs a database per tenant and somewhere to host the app, having it on one isolation substrate and one bill is the pitch. If all you need is to run code, that breadth is irrelevant to you.

SDKs and developer experience

PandaStack ships a Python SDK (pandastack), a TypeScript SDK (@pandastack/sdk), and a CLI (pandastack). The client reads a PANDASTACK_API_KEY (keys use a pds_ prefix) and talks to a configurable base URL, so pointing the same code at the hosted API or your own self-hosted control plane is a config change, not a rewrite. Here's the canonical create-exec-read-fork flow:

import os
from pandastack import PandaStack

client = PandaStack(token=os.environ["PANDASTACK_API_KEY"])  # base URL configurable

# Create a sandbox from a template (179ms p50 via snapshot-restore)
sandbox = client.sandboxes.create(
    template="code-interpreter",
    ttl_seconds=600,
    metadata={"task": "data-analysis"},
)

# Run code inside the microVM
result = sandbox.exec("python -c 'print(2 ** 10)'", timeout_seconds=30)
print(result.stdout)  # -> 1024

# Read and write files
sandbox.filesystem.write("/tmp/notes.txt", "hello from inside the VM")
print(sandbox.filesystem.read("/tmp/notes.txt"))

# Fork one branch from the warmed state (~400ms same-host); call fork() again for each additional parallel branch
branch = sandbox.fork()
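
The 'config change, not a rewrite' point can be sketched like this. It is a minimal, hedged example rather than the SDK's own loader: PANDASTACK_API_KEY and the pds_ prefix come from the text above, while the PANDASTACK_BASE_URL variable name, the placeholder default URL, and the ClientConfig and load_config names are assumptions made for illustration. Check the SDK docs for the real option names.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Placeholder only; the real hosted endpoint is not stated in this post.
DEFAULT_BASE_URL = "https://api.example.invalid"


@dataclass(frozen=True)
class ClientConfig:
    api_key: str
    base_url: str


def load_config(env: Mapping[str, str] = os.environ) -> ClientConfig:
    """Resolve the API key and base URL from the environment."""
    key = env.get("PANDASTACK_API_KEY", "")
    if not key.startswith("pds_"):
        raise ValueError("PANDASTACK_API_KEY is missing or lacks the pds_ prefix")
    # PANDASTACK_BASE_URL is a hypothetical variable name for this sketch.
    base_url = env.get("PANDASTACK_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return ClientConfig(api_key=key, base_url=base_url)


# Same code, self-hosted control plane: only the environment differs.
cfg = load_config(
    {"PANDASTACK_API_KEY": "pds_example", "PANDASTACK_BASE_URL": "http://10.0.0.5:8080/"}
)
print(cfg.base_url)  # http://10.0.0.5:8080
```

Validating the key prefix up front turns a misconfigured environment into an immediate, readable error instead of a failed first API call.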

Sprites ships with a strong agent-DX story out of the box: by Fly's account, the Claude Code, Codex, and Gemini CLIs plus current Python and Node runtimes come pre-installed, and there's a unified developer-and-API surface. That's a legitimate strength, especially if your agents shell into a persistent box and you want the tooling already there. SDK and CLI ergonomics are subjective, so I'd build a one-hour spike against both platforms before committing; you'll know quickly which model fits how your agents actually run.

Templates: what each sandbox ships with

PandaStack ships a set of baked templates so you're not building images on day one: base (Node, Python, Go, Bun via mise), code-interpreter (a Python scientific stack), agent (Claude Code, Codex, OpenCode CLIs), browser (Chromium with Playwright), postgres-16 (the managed database template), and claude-agent. You can also bake your own — the first spawn cold-boots and snapshots it, and every create after is on the fast restore path. The takeaway for the comparison is just that there's a sensible default catalog covering code execution, agentic tooling, and browser automation.

The constraint worth naming up front

Snapshot-restore has a deliberate trade-off: the guest's vCPU and RAM are fixed at bake time. You can't resize a VM at restore — if you want a 4 GiB guest, you bake a 4 GiB template. That's a consequence of restoring a frozen memory image, not a bug, but it's the kind of thing you want to know going in, and it's a place where a persistent-machine model that you provision once and keep can feel more flexible. Name your sizing needs early and bake templates to match.
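
Because size is fixed at bake time, picking a size becomes picking a template. A minimal sketch of that selection logic follows, assuming you keep a small catalog of pre-baked sizes. The template names, the catalog contents, and the BakedTemplate and pick_template names are all hypothetical; they are not PandaStack's built-in templates.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BakedTemplate:
    name: str
    vcpus: int
    memory_mib: int


def pick_template(
    catalog: Sequence[BakedTemplate], vcpus: int, memory_mib: int
) -> Optional[BakedTemplate]:
    """Return the smallest baked template that satisfies the request, or None.

    None means no existing bake is big enough, so a larger template has to be
    baked first; a restored guest cannot be resized.
    """
    fits = [t for t in catalog if t.vcpus >= vcpus and t.memory_mib >= memory_mib]
    return min(fits, key=lambda t: (t.memory_mib, t.vcpus), default=None)


# Hypothetical catalog of sizes baked ahead of time.
CATALOG = [
    BakedTemplate("interp-1g", vcpus=1, memory_mib=1024),
    BakedTemplate("interp-4g", vcpus=2, memory_mib=4096),
    BakedTemplate("interp-8g", vcpus=4, memory_mib=8192),
]

print(pick_template(CATALOG, vcpus=2, memory_mib=3000))
print(pick_template(CATALOG, vcpus=8, memory_mib=16384))  # None: bake a bigger one
```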

When to pick which — honestly

Pick PandaStack when:

You spin up fresh, identical environments per task and want sub-quarter-second create() from a known-good baked state, with no warm pool to manage.
Forking is core to your workload — parallel agent rollouts or branch-and-test patterns where ~400ms same-host copy-on-write forks from a warmed state are the unlock.
You need to self-host — data residency, compliance, VPC isolation, an auditable execution layer, or cost control at scale make a hosted-only service a non-starter, and you have (or want) an infra team to run KVM hosts.
You want more than execution on one substrate — managed Postgres per tenant, git-driven app hosting, and functions, all on the same microVM isolation and one bill.

Pick Fly.io Sprites when:

You're already on Fly's edge network and want their global deploy story and operational track record under your agents — that gravity is real and switching has a cost a marginal feature won't justify.
Your agents are long-lived entities that mutate one environment over time, and persistent-by-default state that scales to zero when idle matches how you think about them better than ephemeral restore.
You want a fully hosted platform with nothing to operate — no KVM hosts, agents, or snapshot storage to babysit — and the pre-installed agent tooling and unified developer/API surface fit how your team works.
After a hands-on spike, their checkpoint/restore semantics and platform specifics fit your stack better — verify the current details on Fly's docs first, since the product is young.

Don't choose on a feature matrix alone — and especially not on a young product's marketing numbers. Build a one-hour spike against both: measure create() and your fork/restore pattern in your own region and on your real workload, not from a docs table. Re-check Sprites' current pricing, idle behavior, latencies, and open-source status against Fly's own pages at the time you decide, because those specifics move on a new product. The right answer depends on your workload, not on whose blog post you read last.
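
For the spike itself, a provider-agnostic timing harness is enough. The sketch below assumes nothing about either SDK: you pass in any zero-argument callable (for example, a lambda that creates and then destroys a sandbox) and it reports nearest-rank percentiles. The names percentile and measure_ms are illustrative. The warmup run is discarded on purpose, since the first spawn of a new template is a cold boot rather than a restore.

```python
import math
import time
from typing import Callable, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_ms(
    operation: Callable[[], object], runs: int = 50, warmup: int = 1
) -> dict[str, float]:
    """Time `operation` and return p50/p99/max in milliseconds."""
    for _ in range(warmup):
        operation()  # discarded: may include a one-time cold boot
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


# Stand-in workload; swap in your own create() or fork() call.
stats = measure_ms(lambda: time.sleep(0.002), runs=20)
print({name: round(value, 2) for name, value in stats.items()})
```

Run it from the region your agents will run in, against both platforms, with the same number of runs; with fewer than about 100 samples the p99 is effectively the maximum, so collect more before comparing tails.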

The bottom line

PandaStack and Fly.io Sprites agree on the most important thing: Firecracker microVMs are the correct isolation model for running untrusted and AI-generated code. They then make opposite bets on top of it. Sprites bets on persistence: long-lived machines that keep their state and scale to zero, on Fly's hosted edge platform. PandaStack bets on fast, stateless restore: 179ms p50 snapshot-restore on every create with no warm pool, first-class ~400ms same-host copy-on-write forks, an open-source Apache-2.0 core you can self-host on your own KVM hosts, and a broader platform spanning managed PostgreSQL, git-driven app hosting, and serverless functions. Neither bet is wrong; they fit different workloads. If you want to widen the search, /blog/e2b-alternatives surveys the landscape and /blog/pandastack-vs-e2b walks the same comparison against E2B. Whichever way you lean, prototype against both and let your own measurements decide.

Frequently asked questions

Are PandaStack and Fly.io Sprites both built on Firecracker?

Yes. Both run each sandbox inside a Firecracker microVM, so every sandbox gets its own guest kernel and hardware-level (KVM) isolation rather than a shared-kernel container — the right isolation model for untrusted or AI-generated code. The differences are above the hypervisor: PandaStack restores a baked snapshot on every create with no warm pool and treats forking as a first-class primitive, while Fly positions Sprites as persistent-by-default machines that scale to zero when idle. Verify Sprites' specifics against Fly's own docs.

Is Fly.io Sprites open-source or self-hostable?

As of this writing, no — Sprites is a hosted Fly.io cloud service. A Fly developer has publicly said they intend to release an open-source local version 'relatively soon,' but that is a stated intention, not a shipped product, so don't plan around it and check Fly's announcements for the current status. PandaStack, by contrast, has an open-source core under Apache-2.0 that is designed to self-host: you run the control-plane API and a per-host agent on your own Linux KVM hosts, and your sandboxes execute on your own infrastructure.

What's the real difference between PandaStack and Fly.io Sprites?

The core split is the bet each makes on top of Firecracker. Sprites are persistent-by-default machines: the filesystem survives indefinitely between sessions and the machine scales to zero (with billing stopping) when idle, which suits long-lived agents that mutate one environment over time. PandaStack has no warm pool and no default persistent box — it restores a baked snapshot on every create at 179ms p50, which suits spinning up fresh, identical environments per task. PandaStack also offers first-class ~400ms same-host copy-on-write forking, Apache-2.0 self-hosting, and a managed-services platform.

How does this compare to E2B vs Fly.io Sprites?

All three run code in Firecracker microVMs, so the isolation primitive is shared and the comparison lives above it. E2B is a focused, mature, hosted-first sandbox; Sprites is Fly's newer persistent-by-default microVM product on its edge network; PandaStack restores a baked snapshot on every create (179ms p50, no warm pool), offers first-class ~400ms same-host forking, and is open-source under Apache-2.0 so you can self-host on your own KVM hosts. See /blog/pandastack-vs-e2b for the E2B-specific breakdown and /blog/e2b-alternatives for the wider landscape, and benchmark create() and fork in your own region before deciding.


Written by Ajay Kumar, Founder, PandaStack.