Building next-agent-md — Markdown for AI Agents in Next.js

A few weeks ago, Cloudflare published a feature called Markdown for Agents.
The idea is simple: if an AI agent requests a page with Accept: text/markdown, return Markdown instead of HTML.
That matters because token cost drops a lot. In Cloudflare's example, the same page went from 16,180 tokens (HTML) to 3,150 tokens (Markdown), roughly an 80% reduction.
The catch: their solution is Cloudflare-specific. I wanted the same behavior for any Next.js app, regardless of infra, so I built next-agent-md.
This post is the architecture story behind it: what I built, why I chose this shape, and which tradeoffs I accepted.
GitHub: fayeed/next-agent-md
How it works
The core mechanism is HTTP content negotiation via the Accept header.
If a request includes Accept: text/markdown, middleware intercepts it and returns Markdown.
The first design choice was where to put this behavior. I chose middleware instead of route handlers because I wanted one integration point for the entire app. In practice, that keeps adoption dead simple: install package, add middleware once, and every page can serve markdown to agents.
Request flow
- Agent sends `GET /about` with `Accept: text/markdown`
- Middleware detects markdown intent
- Middleware self-fetches the same URL with `Accept: text/html`
- HTML boilerplate (nav/footer/scripts/etc.) is stripped
- Clean HTML is converted to Markdown
- Response returns as `text/markdown` with token metadata
I intentionally convert from already-rendered HTML instead of trying to reconstruct content from React internals. That decision keeps the package router-agnostic (App Router and Pages Router) and makes it resilient to how the app is authored.
It also keeps the architecture honest: Next.js already knows how to render your page correctly, so next-agent-md should transform that output, not reimplement rendering logic.
Core middleware (simplified)
```typescript
import { NextRequest, NextResponse } from "next/server"
import { NodeHtmlMarkdown } from "node-html-markdown"

const SKIP_HEADER = "x-markdown-skip"

function wantsMarkdown(req: NextRequest) {
  return req.headers.get("accept")?.includes("text/markdown")
}

async function fetchHtml(req: NextRequest) {
  const headers = new Headers(req.headers)
  headers.set("accept", "text/html")
  headers.set(SKIP_HEADER, "1")

  const res = await fetch(req.url, { method: "GET", headers })
  return res.text()
}

export async function middleware(req: NextRequest) {
  // Internal self-fetch: pass it through untouched (see "Loop prevention")
  if (req.headers.get(SKIP_HEADER) === "1") return NextResponse.next()
  if (!wantsMarkdown(req)) return NextResponse.next()

  const html = await fetchHtml(req)
  const stripped = stripBoilerplate(html) // boilerplate stripping, covered below
  const markdown = NodeHtmlMarkdown.translate(stripped)

  return new NextResponse(markdown, {
    headers: {
      "content-type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(estimateTokens(markdown)), // ~4 chars/token
    },
  })
}
```
Loop prevention
Self-fetching the same URL can recurse forever unless you guard it.
I add an internal header (x-markdown-skip: 1) on internal fetches and short-circuit middleware when it exists:
```typescript
if (request.headers.get("x-markdown-skip") === "1") {
  return NextResponse.next()
}
```
And in internal fetch:
```typescript
headers.set("x-markdown-skip", "1")
headers.set("accept", "text/html")
```
This gives a safe two-pass flow:
- external request asks for markdown
- internal request asks for html with skip header
I preferred an explicit skip header over heuristic checks because it is deterministic and easy to debug in logs when middleware chains grow. This was a reliability decision more than an implementation detail. Middleware stacks evolve over time, and explicit control signals age better than implicit checks.
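Extracted as a pure function, the guard plus the content negotiation check can be sketched like this (names such as `decide` are mine for illustration, not the package's API; the package folds this logic into the middleware itself):

```typescript
type Decision = "passthrough" | "convert"

function decide(headers: Map<string, string>): Decision {
  // Pass 2: the internal self-fetch carries the skip header, so let it through
  if (headers.get("x-markdown-skip") === "1") return "passthrough"
  // Pass 1: only convert when the caller explicitly asked for markdown
  if (!(headers.get("accept") ?? "").includes("text/markdown")) return "passthrough"
  return "convert"
}
```

Keeping the decision deterministic like this makes the two-pass flow easy to unit-test in isolation.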
Middleware API
The package exports withMarkdownForAgents() so you can run standalone or wrap existing middleware.
```typescript
// proxy.ts (Next.js 16+) or middleware.ts (Next.js <= 15)
import { withMarkdownForAgents } from "next-agent-md"

export default withMarkdownForAgents()

export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
}
```
You can also compose with your own middleware:
```typescript
import { withMarkdownForAgents } from "next-agent-md"
import { myAuthMiddleware } from "./lib/auth"

export default withMarkdownForAgents(myAuthMiddleware, {
  contentSignal: { aiTrain: true, search: true, aiInput: false },
})
```
The contentSignal option emits a Content-Signal header for AI policy hints.
I exposed this as a wrapper API because most production apps already have auth, rewrites, or locale middleware. The goal was composition, not replacement.
Stripping boilerplate
Markdown is useful not just because it is shorter, but because it removes noise.
Agents do not need:
- nav bars
- footers
- sidebars
- scripts/styles
Because this runs on Edge, I avoided jsdom and used a lightweight regex strategy.
```typescript
const DEFAULT_STRIP_TAGS = [
  "nav", "header", "footer", "aside",
  "script", "style", "noscript", "iframe", "svg",
]
```
I also strip landmark roles like navigation/banner and HTML comments.
I considered DOM-based parsing, but Edge constraints pushed me toward smaller, browser-compatible code. So I chose a pragmatic middle ground: regex + role-based stripping. It is less theoretically perfect than a full parser, but much better for runtime footprint and cold-start behavior.
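As a hypothetical sketch of that regex strategy (the package's actual implementation may differ, and role-based stripping adds more cases than shown here):

```typescript
const DEFAULT_STRIP_TAGS = [
  "nav", "header", "footer", "aside",
  "script", "style", "noscript", "iframe", "svg",
]

function stripBoilerplate(html: string): string {
  // Remove HTML comments first
  let out = html.replace(/<!--[\s\S]*?-->/g, "")
  // Remove each boilerplate tag together with its contents;
  // a non-greedy match handles non-nested occurrences of the same tag
  for (const tag of DEFAULT_STRIP_TAGS) {
    const re = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}>`, "gi")
    out = out.replace(re, "")
  }
  return out
}
```

The whole thing is a handful of string replaces, which is exactly why it stays within Edge bundle-size limits where a DOM parser would not.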
Pre-building static pages
Self-fetch + convert is fine for dynamic pages, but wasteful for static pages.
So I added a build step:
npx next-agent-md build
This command is intentionally designed to run after next build. At that point Next.js has already materialized static HTML, so next-agent-md build can do pure transformation work with zero runtime ambiguity.
Under the hood, the command does three things:
- reads `.next/prerender-manifest.json` to discover which routes are truly static
- maps each route to its generated HTML file inside `.next/server/...`
- converts those files to markdown and writes them to `public/.well-known/markdown/`
That output path matters because it is deployable static content, so your CDN can serve it directly without invoking any markdown conversion logic at request time.
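The route-to-file mapping at the heart of the build step can be sketched as a pure function (the name `routeToMarkdownPath` is illustrative, not the package's API):

```typescript
const OUTPUT_DIR = "public/.well-known/markdown"

function routeToMarkdownPath(route: string): string {
  // "/" becomes index.md; "/blog/hello-world" becomes blog/hello-world.md
  const rel = route === "/" ? "index.md" : `${route.replace(/^\//, "")}.md`
  return `${OUTPUT_DIR}/${rel}`
}
```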
The resulting layout looks like:

```
public/.well-known/markdown/
  index.md
  about.md
  blog/
    hello-world.md
    getting-started.md
```
Fast path at runtime
```typescript
const prebuilt = await fetchPrebuiltMarkdown(request)
if (prebuilt) {
  return new Response(prebuilt, {
    headers: {
      "content-type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(estimateTokens(prebuilt)),
      "x-markdown-source": "prebuilt",
    },
  })
}

// Fall back to the self-fetch lane for dynamic routes
const html = await fetchPageHtml(request, skipHeader)
```
This split (prebuilt for static, self-fetch for dynamic) keeps runtime overhead low where it can be low, without dropping coverage for dynamic routes. Architecturally, this became a two-lane system:
- build-time lane for static routes (cheap at runtime)
- request-time lane for dynamic routes (always correct)
That balance gave me good latency without sacrificing correctness.
It also made operations simpler: dynamic routes still work automatically, while static routes become precomputed artifacts you can inspect, diff, and debug in CI.
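The prebuilt lookup itself can be sketched as a URL mapping plus a fetch. This is a guess at the shape, not the package's actual implementation; function names here are mine:

```typescript
// Map a page URL to its prebuilt markdown artifact URL
function prebuiltMarkdownUrl(pageUrl: string): string {
  const url = new URL(pageUrl)
  // "/" maps to index.md; other routes map to "<route>.md"
  const route = url.pathname === "/" ? "/index" : url.pathname.replace(/\/$/, "")
  url.pathname = `/.well-known/markdown${route}.md`
  return url.toString()
}

// At request time, try the static artifact first; null means "use the slow lane"
async function fetchPrebuiltMarkdown(pageUrl: string): Promise<string | null> {
  const res = await fetch(prebuiltMarkdownUrl(pageUrl))
  return res.ok ? await res.text() : null
}
```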
Wire it into build:
```json
{
  "scripts": {
    "build": "next build && next-agent-md build"
  }
}
```
Token estimation
Each markdown response includes x-markdown-tokens so agents can budget context.
I used a practical approximation (~1 token per 4 chars) because tiktoken + WASM is not Edge-friendly in this setup.
```typescript
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}
```
This was another deliberate tradeoff: I wanted predictable, low-cost metadata over exact token accounting. The estimate is close enough for context budgeting, and cheap enough to compute on every request.
CLI setup
Setup is intentionally minimal:
```bash
npm install next-agent-md
npx next-agent-md init
```
init will:
- detect Next.js version
- create `proxy.ts` for Next.js 16+ (`middleware.ts` for older versions)
- avoid overwriting existing middleware
- print test curl commands
I spent extra time on init because integration friction usually kills small infrastructure tools. If setup is not one command, most people will never try it.
Example output:
```
✔ Created proxy.ts

AI agents can now request Markdown from any page:

  curl -H "Accept: text/markdown" http://localhost:3000/
```
Test it now
You can test a live endpoint directly with curl:
```bash
curl -si -H "Accept: text/markdown" https://fayeed.dev
```
If you’re testing locally or on your own domain:

```bash
curl -si -H "Accept: text/markdown" http://localhost:3000/
```
Look for:
- `content-type: text/markdown`
- `x-markdown-tokens`
- `x-markdown-source` (when prebuilt markdown is used)
Edge runtime constraints
At request time (Edge Runtime), there are no Node APIs:

- no `fs`
- no `child_process`
- no Node-only libs
So runtime conversion code is Edge-safe, while build tooling (next-agent-md build) runs in Node and can use fs freely.
Separating Edge-time code from Node-time code became a core architectural boundary. Once I made that explicit in package exports, the implementation got much cleaner.
```json
{
  "exports": {
    ".": { "import": "./dist/edge.js" },
    "./config": { "import": "./dist/node-config.js" }
  }
}
```
What’s next
A few improvements I still want:
- Streaming conversion to reduce TTFB for large pages
- Better caching controls (`Cache-Control`, CDN behavior)
- More robust stripping via lightweight streaming HTML parsing
If you want to try it:
- GitHub: fayeed/next-agent-md
- npm install + init:
```bash
npm install next-agent-md
npx next-agent-md init
```
Thanks for reading. If you test this in production, I’d love feedback on edge cases.