TL;DR

HeyGen open-sourced Hyperframes, a deterministic video rendering framework: you write HTML, you get MP4 out. Apache 2.0 license, a single npx hyperframes init to install, and, the part that actually matters, first-class agent skills for Claude Code, Cursor, Gemini CLI, and Codex. The repo pulled 9.2k stars and 736 forks in its launch window. If you have been trying to get AI agents to produce real, reproducible video, this is the missing primitive.

What's new

Every other modern video tool, from Premiere, DaVinci Resolve, and After Effects to most web-based editors, was designed for a human with a mouse. Agents cannot drag a clip, scrub a timeline, or click a keyframe. They can, however, write HTML extremely well.

Hyperframes leans hard into that asymmetry. A composition is a plain index.html file. Timing lives in data attributes (data-start, data-duration, data-track-index). Layers are DOM elements. GSAP timelines drive animation. Headless Chrome renders it. FFmpeg encodes it. Same HTML, same MP4, every single time.

It plays as-is — no React, no TSX, no build step, no proprietary DSL.
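
Concretely, a minimal composition might look like the sketch below, built from the data-attribute schema listed under Technical facts. Treat it as illustrative: putting the root attributes on <body>, loading GSAP from a CDN, and registering window.__timelines as an array are all assumptions here.

  <!DOCTYPE html>
  <html>
    <!-- Root carries composition metadata (attribute placement assumed) -->
    <body data-composition-id="hello" data-width="1920" data-height="1080">
      <!-- A clip: on screen from t=0s for 3s, on track 0 -->
      <h1 class="clip" id="title" data-start="0" data-duration="3" data-track-index="0">
        Hello, Hyperframes
      </h1>
      <script src="https://cdn.jsdelivr.net/npm/gsap@3/dist/gsap.min.js"></script>
      <script>
        // Timeline is created paused so the renderer can seek it frame by frame
        const tl = gsap.timeline({ paused: true });
        tl.from("#title", { opacity: 0, y: 40, duration: 1 });
        // Register for the engine to drive; the array shape is an assumption
        window.__timelines = [tl];
      </script>
    </body>
  </html>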

Why it matters

Agents already speak HTML as a native language. Getting them to author video used to mean gluing together Remotion components, a bundler, a render queue, and a bunch of real-time wall-clock animation libraries that drop frames under CPU pressure. Hyperframes collapses that into one format LLMs generate fluently on the first try.

The license is the other load-bearing decision. Apache 2.0 is OSI open source — no per-render fees, no seat caps, no company-size thresholds. That is explicitly a counter-positioning move against Remotion's source-available license, which requires paid tiers above small-team thresholds.

Technical facts

  • Input format: HTML + CSS + GSAP (or Lottie, Three.js, Anime.js, CSS animations — anything that can be seeked to a frame).
  • Data-attribute schema: root needs data-composition-id, data-width, data-height; clips need data-start, data-duration, data-track-index, and class="clip". GSAP timeline must be { paused: true } and registered on window.__timelines.
  • Render pipeline: headless Chrome (Puppeteer) + beginFrame capture API → piped to FFmpeg → 60fps MP4, MOV, or WebM.
  • Determinism by design: seek-driven, not wall-clock. The frame formula is frame = floor(time * fps). Same input always produces byte-identical output, which makes CI snapshot testing and batch rendering actually work (see the loop sketch after this list).
  • Requirements: Node.js 22+ and FFmpeg. Docker optional for reproducible renders.
  • Catalog: 50+ ready-to-use blocks and components — social overlays, shader transitions, data-viz, cinematic effects.
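
To make the determinism point concrete, here is a minimal sketch of a seek-driven capture loop. This is not the actual engine code: it assumes an async context and a Puppeteer page with the composition loaded, and page.screenshot stands in for the real beginFrame-to-FFmpeg pipeline.

  // Minimal sketch of a seek-driven capture loop; not the actual engine code.
  const fps = 60;
  const durationSec = 10; // illustrative composition length
  const totalFrames = Math.floor(durationSec * fps); // frame = floor(time * fps)

  for (let frame = 0; frame < totalFrames; frame++) {
    const time = frame / fps; // fixed timestamp: frame 90 at 60fps is always t=1.5s
    await page.evaluate((t) => {
      // Seek every registered (paused) GSAP timeline to the exact second;
      // no wall clock is involved, so reruns produce identical pixels.
      window.__timelines.forEach((tl) => tl.seek(t));
    }, time);
    await page.screenshot({ path: `frames/${String(frame).padStart(6, "0")}.png` });
  }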

Package architecture

  • hyperframes: CLI (create, preview, lint, render)
  • @hyperframes/core: types, HTML parser, linter, runtime, frame adapters
  • @hyperframes/engine: seekable page-to-video capture (Puppeteer + headless Chrome)
  • @hyperframes/producer: full pipeline (capture + FFmpeg encode + audio mix)
  • @hyperframes/studio: browser-based composition editor UI
  • @hyperframes/player: embeddable <hyperframes-player> web component
  • @hyperframes/shader-transitions: WebGL shader transitions
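
For a sense of how the player package slots in, here is a hypothetical embed sketch. Only the <hyperframes-player> tag name comes from the list above; the module path and the src attribute are assumptions.

  <!-- Hypothetical embed; module path and attribute names are assumptions -->
  <script type="module" src="/node_modules/@hyperframes/player/dist/index.js"></script>
  <hyperframes-player src="./index.html"></hyperframes-player>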

The agent skills are the real unlock

Other HTML-to-video projects have existed. What makes Hyperframes production-ready for agents is that it ships framework-specific skills for the major coding agents: Claude Code, Cursor, Gemini CLI, and Codex (exposed as an OpenAI Codex plugin). These are not generic HTML docs scraped into a system prompt — they encode Hyperframes' actual patterns: the data-attribute schema, the GSAP sequencing rules, the caption styling, the non-interactive CLI conventions.

The skills auto-install on project init:

npx hyperframes init my-video

In Claude Code they register as slash commands:

  • /hyperframes — composition authoring
  • /hyperframes-cli — CLI operations
  • /gsap — animation help

The result: the agent loads the correct rules explicitly and produces a working composition on the first attempt, instead of hallucinating an API that does not exist.

Comparison — Hyperframes vs Remotion

Hyperframes is explicitly inspired by Remotion — HeyGen used Remotion in production, and the source keeps attribution comments for patterns it pioneered (Chrome launch flags, image2pipe → FFmpeg streaming, frame buffering). Both drive headless Chrome, both are deterministic. The fork point is the authoring bet.

  • Authoring: Hyperframes uses HTML + CSS + GSAP; Remotion uses React components (TSX).
  • Build step: none in Hyperframes (index.html runs as-is); Remotion requires a bundler.
  • Animations driven by a library clock (GSAP, Anime.js): seekable and frame-accurate in Hyperframes; they play at wall-clock speed during a Remotion render.
  • Arbitrary HTML/CSS passthrough: paste and animate in Hyperframes; rewrite as JSX in Remotion.
  • Distributed rendering: single-machine or Docker today in Hyperframes; Remotion Lambda is production-ready.
  • License: Apache 2.0 (OSI) for Hyperframes; Remotion is source-available with a paid tier above small-team thresholds.

Use cases

  • Website-to-video pipeline: a documented eight-step flow: capture URL → extract design tokens → script → storyboard → voiceover and timing → build composition → snapshot-validate → render.
  • Data and financial viz: point the framework at live DOM, animate charts, render MP4 for YouTube market reports without building custom motion graphics.
  • HeyGen avatar pairing: photorealistic speaking avatar over dynamically rendered HTML backgrounds and lower-thirds, all triggered by a single CLI call.
  • Prompt-to-video formats: PDF → 45-second pitch video, CSV → bar-chart race, 9:16 TikTok hook with bouncy TTS-synced captions.
  • Iterative editing in natural language: "make the title 2x bigger, swap to dark mode, add a fade-out at the end" — the agent edits the HTML.

Limitations & availability

This is a young project — 45 releases in the launch window and the current version is v0.4.12. Early GitHub issues show launch-week install friction, and open PRs are tracking HDR support, publish/share flows, and richer authoring pipelines.

Two concrete gaps worth knowing before you commit:

  • No distributed cloud renderer. Today it is single-machine or Docker. For high-volume server rendering, Remotion Lambda still wins until HeyGen ships its own hosted option.
  • Preview can stutter on expensive CSS (large backdrop-filter, heavy shadow stacks). The final MP4 is unaffected because the render is seek-driven, not real-time — but it confuses people expecting preview fidelity to predict render fidelity.

Pricing is zero. Apache 2.0, commercial use at any scale, no per-render fees, no seat caps. Start with npx hyperframes init my-video or browse the catalog of 50+ blocks.

What's next

HDR and publish/share flows are already in the PR queue. The directional bet is clear: HeyGen is positioning Hyperframes as the deterministic compositing layer underneath its avatar and TTS stack, so expect the website-to-video pipeline to get more native integrations with HeyGen's audio and avatar APIs. For agent builders, the practical implication is simpler — video generation is no longer a GUI-bound task blocking autonomous pipelines.

Source: github.com/heygen-com/hyperframes, hyperframes.heygen.com, official docs.