Computer Use (CDP + VNC)

A computer-use sandbox is just a regular isorun microVM running an image that has Chromium + Xvfb + x11vnc + noVNC pre-installed. The public-port proxy exposes Chrome’s DevTools Protocol (CDP) and the noVNC web UI through the same https://run<id>.isorun.ai/sb/p/<port>/ URLs as any other in-sandbox service.

You don’t need to know or care about the bind address. Chromium 124+ ignores --remote-debugging-address=0.0.0.0 and always binds its CDP server to 127.0.0.1 for security. The same goes for Jupyter, PostgreSQL, Redis, and a long tail of other services that listen on localhost only by default. The runner detects this on the first sb.url(port) request and asks the in-VM agent to start a transparent TCP forwarder from the guest’s eth0 IP to 127.0.0.1; subsequent requests skip that step. Run your service with its default flags and it just works.
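To make the forwarding idea concrete, here is a minimal sketch of a transparent TCP forwarder using asyncio. It is not the in-VM agent's actual code; the function names (start_forwarder, _pipe) are hypothetical, and a production forwarder would add timeouts, logging, and half-close handling.

```python
import asyncio


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes one way until EOF, then close the write side."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away mid-transfer
    finally:
        writer.close()


async def start_forwarder(listen_host: str, listen_port: int,
                          target_port: int) -> asyncio.AbstractServer:
    """Accept on listen_host:listen_port and relay to 127.0.0.1:target_port."""

    async def handle(client_r, client_w):
        try:
            upstream_r, upstream_w = await asyncio.open_connection(
                "127.0.0.1", target_port)
        except OSError:
            client_w.close()  # nothing is listening on localhost yet
            return
        # Relay both directions concurrently.
        await asyncio.gather(_pipe(client_r, upstream_w),
                             _pipe(upstream_r, client_w))

    return await asyncio.start_server(handle, listen_host, listen_port)
```

In the scenario described above, listen_host would be the guest's eth0 address and both ports would be 9222, so a service bound to localhost becomes reachable from outside the VM without changing its flags.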

What you get:

  • CDP on port 9222 — connect Playwright, pyppeteer, Anthropic Computer Use, OpenAI Operator, or any other client that speaks the Chrome DevTools Protocol.
  • noVNC web UI on port 6080 — open in any browser, watch the agent operate the desktop in real time. WebSocket pass-through is end-to-end so the live screen update works without extra config.

Build the image (one-time)

The repo ships a Dockerfile for the computer-use image at images/computer-use/. Build and push it once to your registry (Docker Hub, GHCR, ECR, anything crane can pull from):

cd images/computer-use
REGISTRY=docker.io/your-account ./build.sh

This produces docker.io/your-account/isorun-computer-use:1.0. You can also pin a different tag, build it on a CI runner, or write your own Dockerfile — the only requirements are:

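  • Chromium started with --remote-debugging-port=9222 (adding --remote-debugging-address=0.0.0.0 is harmless, but recent Chromium ignores it and the runner forwards to 127.0.0.1 anyway)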
  • Xvfb on :99 (the image already provides this)
  • A startup script at /usr/local/bin/start-vnc.sh that brings them up in the background

The resulting image is ~700 MB on disk (Debian slim + chromium + fonts + noVNC). On first use isorun pulls it once and caches a golden snapshot, so subsequent boots take the standard ~30 ms.

Use it

import time

from isorun import Sandbox

with Sandbox(
    "docker.io/your-account/isorun-computer-use:1.0",
    vcpus=2,
    mem_mib=2048,
) as sb:
    # The runner's init runs the isorun agent, not the image's CMD,
    # so we kick off the desktop ourselves. The trailing & backgrounds
    # the script so the exec call returns immediately.
    sb.exec("/usr/local/bin/start-vnc.sh &")

    # Wait a moment for Chromium + noVNC to bind
    time.sleep(2)

    print("CDP :", sb.cdp_url())    # → https://run<id>.isorun.ai/sb/p/9222/
    print("noVNC :", sb.vnc_url())  # → https://run<id>.isorun.ai/sb/p/6080/

    # Hand the URL to the user, or drive Chromium yourself.
    input("Press enter to destroy…")
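A fixed two-second sleep can be too short on a loaded host and wasteful otherwise. The sketch below polls an HTTP endpoint until it answers. The helper name wait_for_http is hypothetical, and using it against sb.cdp_url() + "json/version" assumes the public-port proxy passes Chromium's /json/version endpoint through unchanged, which this page does not confirm.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 15.0, interval: float = 0.25) -> bool:
    """Poll url until it answers with a 2xx status or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval + 2) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or the proxy returned an error
        time.sleep(interval)
    return False


# Hypothetical usage inside the `with Sandbox(...) as sb:` block:
#     if not wait_for_http(sb.cdp_url() + "json/version"):
#         raise RuntimeError("Chromium did not come up in time")
```

The same helper works for the noVNC port, since its landing page is plain HTTP.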

Drive it with Playwright

import time

from playwright.sync_api import sync_playwright
from isorun import Sandbox

with Sandbox("docker.io/your-account/isorun-computer-use:1.0",
             vcpus=2, mem_mib=2048) as sb:
    sb.exec("/usr/local/bin/start-vnc.sh &")
    time.sleep(2)

    with sync_playwright() as p:
        # Attach to the Chromium that is already running in the sandbox.
        browser = p.chromium.connect_over_cdp(sb.cdp_url())
        ctx = browser.contexts[0] if browser.contexts else browser.new_context()
        page = ctx.new_page()
        page.goto("https://example.com")
        print(page.title())
        page.screenshot(path="/tmp/screenshot.png")

Drive it with Anthropic Computer Use

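Anthropic’s computer use tool does not connect to a browser on its own. The model replies with tool_use blocks describing actions (take a screenshot, click at a coordinate, type text); your code carries them out and sends the results back. The sandbox is where those actions run, for example by executing them against Chromium with Playwright over sb.cdp_url().

import time

import anthropic
from isorun import Sandbox

with Sandbox("docker.io/your-account/isorun-computer-use:1.0",
             vcpus=2, mem_mib=2048) as sb:
    sb.exec("/usr/local/bin/start-vnc.sh &")
    time.sleep(2)

    client = anthropic.Anthropic()
    response = client.beta.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=[{
            # The tool definition only describes the display and takes no
            # CDP endpoint. Check Anthropic's docs for the tool version and
            # beta flag that match your model.
            "type": "computer_20250124",
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }],
        messages=[{"role": "user", "content": "Open example.com and read me the page title."}],
        betas=["computer-use-2025-01-24"],
    )
    # response.content holds text and tool_use blocks. Run each tool_use
    # action in the sandbox and reply with a tool_result to continue.
    print(response.content)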
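Because the model only describes actions, something has to execute them in the sandbox. The sketch below maps a few action types onto a Playwright page connected over CDP. The function name run_action is hypothetical, and the action names and fields ("screenshot", "left_click", "mouse_move", "type", "key", "coordinate", "text") follow my reading of Anthropic's computer use tool; verify them against the current documentation. It also assumes the screenshots you return come from page.screenshot(), so the model's coordinates line up with the page viewport.

```python
def run_action(page, action: dict):
    """Apply one computer-use action to a Playwright page.

    Returns PNG bytes for "screenshot", otherwise None.
    """
    kind = action["action"]
    if kind == "screenshot":
        return page.screenshot()
    if kind == "left_click":
        x, y = action["coordinate"]
        page.mouse.click(x, y)
    elif kind == "mouse_move":
        x, y = action["coordinate"]
        page.mouse.move(x, y)
    elif kind == "type":
        page.keyboard.type(action["text"])
    elif kind == "key":
        # Key names from the model may need translating to Playwright's
        # names (for example "Return" to "Enter"); not handled here.
        page.keyboard.press(action["text"])
    else:
        raise ValueError(f"unsupported action: {kind}")
    return None
```

In an agent loop you would call run_action(page, block.input) for each tool_use block, then send the screenshot or an acknowledgement back as a tool_result in the next messages.create call.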

Watch the agent in real time

Open sb.vnc_url() in any browser. You get the noVNC web UI showing the live Chromium session — the agent’s mouse moves, page loads, typed input, all of it. Useful for debugging agent loops and for building “supervised computer use” UIs where a human monitors the agent visually.

Behind the scenes: noVNC is HTML/JS running in your browser, talking to a websockify proxy on the sandbox’s port 6080, which converts the WebSocket frames to TCP for x11vnc on port 5900. The whole chain (your browser → Cloudflare → cloudflared → runner proxy → guest noVNC → websockify → x11vnc → Xvfb) passes through cleanly because every hop in the runner-side proxy supports WebSocket upgrade.

Resources

  • vCPUs: 2 minimum (Chromium hates being CPU-starved)
  • Memory: 2 GiB minimum, 4 GiB if you’ll have many tabs open
  • Disk: default 3 GiB scratch is enough; bump to 10 GiB if you expect downloads or large session storage

Why this is fast

  • Image build: one-time, on your CI machine. Push to a registry.
  • Sandbox boot: same ~30 ms cold start as any other image, because the golden snapshot already has Chromium’s pages faulted in via the pre-warm step.
  • CDP connection: sub-100 ms from a colocated client, dominated by the cloudflared tunnel hop.
  • Per-action latency (click, screenshot, type): 50-150 ms, bounded by CDP round-trip + Xvfb redraw — the same as any local Playwright session.

A computer-use sandbox costs about the same as a regular sandbox (~$0.082/hr at 2 vCPU + 2 GiB), and like every other isorun sandbox you can sb.hibernate() it when idle and pay zero until the user comes back.
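As a rough illustration of what hibernation saves, the arithmetic can be sketched as below. It assumes the quoted hourly rate is billed pro rata for awake time and that hibernated time costs nothing, as stated above; the function name session_cost is hypothetical and real billing granularity may differ.

```python
HOURLY_RATE_USD = 0.082  # figure quoted above for 2 vCPU + 2 GiB


def session_cost(active_hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Cost of a session billed only for the hours it is awake."""
    return round(active_hours * rate, 4)


# An 8-hour session kept awake the whole time:
print(session_cost(8))    # 0.656
# The same session hibernated except for 1.5 hours of actual use:
print(session_cost(1.5))  # 0.123
```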