- HeyGen just released Hyperframes under Apache 2.0 — a video framework where HTML is the composition, Puppeteer + FFmpeg is the renderer, and AI agents are the target user.
- 9.2k stars in the launch window.
TL;DR
HeyGen open-sourced Hyperframes, a deterministic video rendering framework where you write HTML and get MP4 out. It ships under the Apache 2.0 license, installs with a single `npx hyperframes init`, and (the part that actually matters) includes first-class agent skills for Claude Code, Cursor, Gemini CLI, and Codex. The repo pulled 9.2k stars and 736 forks in its launch window. If you have been trying to get AI agents to produce real, reproducible video, this is the missing primitive.
What's new
Most mainstream video tools (Premiere, DaVinci, After Effects, even a lot of web-based editors) were designed for a human with a mouse. Agents cannot drag a clip, scrub a timeline, or click a keyframe. They can, however, write HTML extremely well.
Hyperframes leans hard into that asymmetry. A composition is a plain index.html file. Timing lives in data attributes (data-start, data-duration, data-track-index). Layers are DOM elements. GSAP timelines drive animation. Headless Chrome renders it. FFmpeg encodes it. Same HTML, same MP4, every single time.
It plays as-is — no React, no TSX, no build step, no proprietary DSL.
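To make the format concrete, here is a minimal JavaScript sketch that assembles a composition as an HTML string. The attribute names are the ones this article lists (the root attributes appear under Technical facts). Everything else is an assumption for illustration: the helper name `buildComposition`, the clip field names, the use of `div` elements, and the reading of `data-start` and `data-duration` as seconds. It is not Hyperframes API.

```javascript
// Hypothetical helper, not part of Hyperframes. It only shows the shape of the markup.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Assumption: the root and the clips are plain <div> elements, times are in seconds.
function buildComposition({ id, width, height, clips }) {
  const layers = clips
    .map(
      (clip) =>
        `  <div class="clip" data-start="${escapeAttr(clip.start)}" ` +
        `data-duration="${escapeAttr(clip.duration)}" ` +
        `data-track-index="${escapeAttr(clip.track)}">${clip.html}</div>`
    )
    .join("\n");

  return (
    `<div data-composition-id="${escapeAttr(id)}" ` +
    `data-width="${escapeAttr(width)}" data-height="${escapeAttr(height)}">\n` +
    `${layers}\n</div>`
  );
}

const html = buildComposition({
  id: "intro",
  width: 1920,
  height: 1080,
  clips: [
    { start: 0, duration: 3, track: 0, html: "<h1>Hello</h1>" },
    { start: 3, duration: 2, track: 1, html: "<p>World</p>" },
  ],
});
console.log(html);
```

The point of the sketch is that the whole timeline is ordinary markup: an agent that can emit a string like this has already authored the edit decision list.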
Why it matters
Agents already speak HTML as a native language. Getting them to author video used to mean gluing together Remotion components, a bundler, a render queue, and a bunch of real-time wall-clock animation libraries that drop frames under CPU pressure. Hyperframes collapses that into one format LLMs generate fluently on the first try.
The license is the other load-bearing decision. Apache 2.0 is OSI open source — no per-render fees, no seat caps, no company-size thresholds. That is explicitly a counter-positioning move against Remotion's source-available license, which requires paid tiers above small-team thresholds.
Technical facts
- Input format: HTML + CSS + GSAP (or Lottie, Three.js, Anime.js, CSS animations — anything that can be seeked to a frame).
- Data-attribute schema: root needs `data-composition-id`, `data-width`, `data-height`; clips need `data-start`, `data-duration`, `data-track-index`, and `class="clip"`. GSAP timeline must be `{ paused: true }` and registered on `window.__timelines`.
- Render pipeline: headless Chrome (Puppeteer) + `beginFrame` capture API → piped to FFmpeg → 60fps MP4, MOV, or WebM.
- Determinism by design: seek-driven, not wall-clock. Formula: `frame = floor(time * fps)`. Same input always produces byte-identical output, which makes CI snapshot testing and batch rendering actually work.
- Requirements: Node.js 22+ and FFmpeg. Docker optional for reproducible renders.
- Catalog: 50+ ready-to-use blocks and components — social overlays, shader transitions, data-viz, cinematic effects.
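The determinism claim is easier to see in code. The following JavaScript sketch applies the stated formula and then walks a composition frame by frame, deriving each timestamp from the frame index instead of a clock. `makeTimeline` is a hypothetical stand-in for a paused, seekable timeline; it is not GSAP and not the Hyperframes engine.

```javascript
// frame = floor(time * fps), the formula quoted in the list above.
function timeToFrame(timeSeconds, fps) {
  return Math.floor(timeSeconds * fps);
}

// Seek-driven rendering visits every frame index exactly once and derives
// the timestamp from the index, never from a wall clock.
function* frameTimes(durationSeconds, fps) {
  const totalFrames = Math.floor(durationSeconds * fps);
  for (let frame = 0; frame < totalFrames; frame++) {
    yield { frame, time: frame / fps };
  }
}

// Hypothetical stand-in for a paused timeline: a linear tween whose value
// depends only on the time it is asked to seek to.
function makeTimeline(fromX, toX, durationSeconds) {
  return {
    seek(time) {
      const progress = Math.min(Math.max(time / durationSeconds, 0), 1);
      return fromX + (toX - fromX) * progress;
    },
  };
}

// "Renders" one value per frame; a real renderer would capture a screenshot here.
function renderPositions(durationSeconds, fps) {
  const timeline = makeTimeline(0, 100, durationSeconds);
  return [...frameTimes(durationSeconds, fps)].map(({ time }) => timeline.seek(time));
}

console.log(renderPositions(1, 4));
```

Because every value is a pure function of the frame index, a slow machine produces the same frames as a fast one. It just takes longer, which is the property that wall-clock animation loops lack.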
Package architecture
| Package | What it does |
|---|---|
| `hyperframes` | CLI: create, preview, lint, render |
| `@hyperframes/core` | Types, HTML parser, linter, runtime, frame adapters |
| `@hyperframes/engine` | Seekable page-to-video capture (Puppeteer + headless Chrome) |
| `@hyperframes/producer` | Full pipeline: capture + FFmpeg encode + audio mix |
| `@hyperframes/studio` | Browser-based composition editor UI |
| `@hyperframes/player` | Embeddable `<hyperframes-player>` web component |
| `@hyperframes/shader-transitions` | WebGL shader transitions |
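For readers who have not built a capture-and-encode pipeline before, the encode half can be sketched as an FFmpeg argument list that reads a stream of image frames from stdin. The flags below are standard FFmpeg options, but the function name `ffmpegArgs` and the specific codec choices are assumptions; this is not the invocation `@hyperframes/producer` actually uses.

```javascript
// Hypothetical helper: builds arguments for an ffmpeg process that reads
// image frames from stdin and writes an H.264 MP4.
function ffmpegArgs({ fps, output }) {
  return [
    "-y",                      // overwrite the output file if it exists
    "-f", "image2pipe",        // input is a stream of images on a pipe
    "-framerate", String(fps), // input frame rate
    "-i", "-",                 // read input from stdin
    "-c:v", "libx264",         // encode video as H.264
    "-pix_fmt", "yuv420p",     // pixel format most players accept
    output,
  ];
}

const args = ffmpegArgs({ fps: 60, output: "out.mp4" });
console.log(["ffmpeg", ...args].join(" "));
```

In a real pipeline these arguments would go to something like Node's `child_process.spawn("ffmpeg", args)`, with each captured frame written to the child's stdin in order.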
The agent skills are the real unlock
Other HTML-to-video projects have existed. What makes Hyperframes production-ready for agents is that it ships framework-specific skills for the major coding agents: Claude Code, Cursor, Gemini CLI, and Codex (exposed as an OpenAI Codex plugin). These are not generic HTML docs scraped into a system prompt — they encode Hyperframes' actual patterns: the data-attribute schema, the GSAP sequencing rules, the caption styling, the non-interactive CLI conventions.
The skills auto-install on project init:
`npx hyperframes init my-video`
In Claude Code they register as slash commands:
- `/hyperframes`: composition authoring
- `/hyperframes-cli`: CLI operations
- `/gsap`: animation help
The result: the agent loads the correct rules explicitly and produces a working composition on the first attempt, instead of hallucinating an API that does not exist.
Comparison — Hyperframes vs Remotion
Hyperframes is explicitly inspired by Remotion — HeyGen used Remotion in production, and the source keeps attribution comments for patterns it pioneered (Chrome launch flags, image2pipe → FFmpeg streaming, frame buffering). Both drive headless Chrome, both are deterministic. The fork point is the authoring bet.
| | Hyperframes | Remotion |
|---|---|---|
| Authoring | HTML + CSS + GSAP | React components (TSX) |
| Build step | None — index.html runs as-is | Required bundler |
| Library-clock animations (GSAP, Anime.js) | Seekable, frame-accurate | Plays at wall-clock during render |
| Arbitrary HTML/CSS passthrough | Paste and animate | Rewrite as JSX |
| Distributed rendering | Single-machine / Docker today | Lambda, production-ready |
| License | Apache 2.0 (OSI) | Source-available, paid tier above small-team thresholds |
Use cases
- Website-to-video pipeline: a documented 7-step flow — capture URL → extract design tokens → script → storyboard → voiceover and timing → build composition → snapshot-validate → render.
- Data and financial viz: point the framework at live DOM, animate charts, render MP4 for YouTube market reports without building custom motion graphics.
- HeyGen avatar pairing: photorealistic speaking avatar over dynamically rendered HTML backgrounds and lower-thirds, all triggered by a single CLI call.
- Prompt-to-video formats: PDF → 45-second pitch video, CSV → bar-chart race, 9:16 TikTok hook with bouncy TTS-synced captions.
- Iterative editing in natural language: "make the title 2x bigger, swap to dark mode, add a fade-out at the end" — the agent edits the HTML.
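As one illustration of the caption use case, the JavaScript sketch below turns word-level timings (the kind a TTS or transcription step might return) into clip elements carrying the timing attributes. The input shape `{ text, start, end }`, the helper name `captionClips`, and the use of `span` elements are all assumptions made for the example, not a documented Hyperframes interface.

```javascript
// Escape text so caption words cannot inject markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: one clip per word, all on the same track.
// Assumption: start and end are in seconds.
function captionClips(words, trackIndex) {
  return words.map((word) => {
    // Round to milliseconds to avoid float noise such as 0.6000000000000001.
    const duration = Number((word.end - word.start).toFixed(3));
    return (
      `<span class="clip" data-start="${word.start}" ` +
      `data-duration="${duration}" data-track-index="${trackIndex}">` +
      `${escapeHtml(word.text)}</span>`
    );
  });
}

const clips = captionClips(
  [
    { text: "Ship", start: 0, end: 0.4 },
    { text: "it <now>", start: 0.4, end: 1 },
  ],
  2
);
console.log(clips.join("\n"));
```

Styling and the "bouncy" motion would still live in CSS and a timeline; the sketch only covers the timing attributes.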
Limitations & availability
This is a young project — 45 releases in the launch window and the current version is v0.4.12. Early GitHub issues show launch-week install friction, and open PRs are tracking HDR support, publish/share flows, and richer authoring pipelines.
Two concrete gaps worth knowing before you commit:
- No distributed cloud renderer. Today it is single-machine or Docker. For high-volume server rendering, Remotion Lambda still wins until HeyGen ships their own hosted option.
- Preview can stutter on expensive CSS (large `backdrop-filter`, heavy shadow stacks). The final MP4 is unaffected because the render is seek-driven, not real-time, but it confuses people expecting preview fidelity to predict render fidelity.
Pricing is zero. Apache 2.0, commercial use at any scale, no per-render fees, no seat caps. Start with npx hyperframes init my-video or browse the catalog of 50+ blocks.
What's next
HDR and publish/share flows are already in the PR queue. The directional bet is clear: HeyGen is positioning Hyperframes as the deterministic compositing layer underneath its avatar and TTS stack, so expect the website-to-video pipeline to get more native integrations with HeyGen's audio and avatar APIs. For agent builders, the practical implication is simpler — video generation is no longer a GUI-bound task blocking autonomous pipelines.
Sources: github.com/heygen-com/hyperframes, hyperframes.heygen.com, official docs.

