Building next-agent-md — Markdown for AI Agents in Next.js

A few weeks ago, Cloudflare published a feature called Markdown for Agents.
The idea is simple: if an AI agent requests a page with Accept: text/markdown, return Markdown instead of HTML.
That matters because token cost drops a lot. In Cloudflare's example, the same page went from 16,180 tokens (HTML) to 3,150 tokens (Markdown), roughly an 80% reduction.
The catch: their solution is Cloudflare-specific. I wanted the same behavior for any Next.js app, regardless of infra, so I built next-agent-md.
This post is the architecture story behind it: what I built, why I chose this shape, and which tradeoffs I accepted.
GitHub: fayeed/next-agent-md
How it works
The core mechanism is HTTP content negotiation via the Accept header.
If a request includes Accept: text/markdown, middleware intercepts it and returns Markdown.
The first design choice was where to put this behavior. I chose middleware instead of route handlers because I wanted one integration point for the entire app. In practice, that keeps adoption dead simple: install package, add middleware once, and every page can serve markdown to agents.
Request flow
- Agent sends `GET /about` with `Accept: text/markdown`
- Middleware detects markdown intent
- Middleware self-fetches the same URL with `Accept: text/html`
- HTML boilerplate (nav/footer/scripts/etc.) is stripped
- Clean HTML is converted to Markdown
- Response returns as `text/markdown` with token metadata
I intentionally convert from already-rendered HTML instead of trying to reconstruct content from React internals. That decision keeps the package router-agnostic (App Router and Pages Router) and makes it resilient to how the app is authored.
It also keeps the architecture honest: Next.js already knows how to render your page correctly, so next-agent-md should transform that output, not reimplement rendering logic.
Core middleware (simplified)
```typescript
import { NextRequest, NextResponse } from "next/server"
import { NodeHtmlMarkdown } from "node-html-markdown"

const SKIP_HEADER = "x-markdown-skip"

function wantsMarkdown(req: NextRequest) {
  return req.headers.get("accept")?.includes("text/markdown")
}

async function fetchHtml(req: NextRequest) {
  const headers = new Headers(req.headers)
  headers.set("accept", "text/html")
  headers.set(SKIP_HEADER, "1")

  const res = await fetch(req.url, { method: "GET", headers })
  return res.text()
}

export async function middleware(req: NextRequest) {
  // Internal self-fetch: pass it through untouched (see "Loop prevention")
  if (req.headers.get(SKIP_HEADER) === "1") return NextResponse.next()
  if (!wantsMarkdown(req)) return NextResponse.next()

  const html = await fetchHtml(req)
  const stripped = stripBoilerplate(html) // boilerplate stripping, covered below
  const markdown = NodeHtmlMarkdown.translate(stripped)

  return new NextResponse(markdown, {
    headers: {
      "content-type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(estimateTokens(markdown)), // ~4 chars/token
    },
  })
}
```
Loop prevention
Self-fetching the same URL can recurse forever unless you guard it.
I add an internal header (x-markdown-skip: 1) on internal fetches and short-circuit middleware when it exists:
```typescript
if (request.headers.get("x-markdown-skip") === "1") {
  return NextResponse.next()
}
```
And in internal fetch:
```typescript
headers.set("x-markdown-skip", "1")
headers.set("accept", "text/html")
```
This gives a safe two-pass flow:
- external request asks for markdown
- internal request asks for html with skip header
I preferred an explicit skip header over heuristic checks because it is deterministic and easy to debug in logs when middleware chains grow. This was a reliability decision more than an implementation detail. Middleware stacks evolve over time, and explicit control signals age better than implicit checks.
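Extracted as a pure function, the guard plus the content negotiation check can be sketched like this (names such as `decide` are mine for illustration, not the package's API; the package folds this logic into the middleware itself):

```typescript
type Decision = "passthrough" | "convert"

function decide(headers: Map<string, string>): Decision {
  // Pass 2: the internal self-fetch carries the skip header, so let it through
  if (headers.get("x-markdown-skip") === "1") return "passthrough"
  // Pass 1: only convert when the caller explicitly asked for markdown
  if (!(headers.get("accept") ?? "").includes("text/markdown")) return "passthrough"
  return "convert"
}
```

Keeping the decision deterministic like this makes the two-pass flow easy to unit-test in isolation.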
Middleware API
The package exports withMarkdownForAgents() so you can run standalone or wrap existing middleware.
```typescript
// proxy.ts (Next.js 16+) or middleware.ts (Next.js <= 15)
import { withMarkdownForAgents } from "next-agent-md"

export default withMarkdownForAgents()

export const config = {
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
}
```
You can also compose with your own middleware:
```typescript
import { withMarkdownForAgents } from "next-agent-md"
import { myAuthMiddleware } from "./lib/auth"

export default withMarkdownForAgents(myAuthMiddleware, {
  contentSignal: { aiTrain: true, search: true, aiInput: false },
})
```
The contentSignal option emits a Content-Signal header for AI policy hints.
I exposed this as a wrapper API because most production apps already have auth, rewrites, or locale middleware. The goal was composition, not replacement.
Stripping boilerplate
Markdown is useful not just because it is shorter, but because it removes noise.
Agents do not need:
- nav bars
- footers
- sidebars
- scripts/styles
Because this runs on Edge, I avoided jsdom and used a lightweight regex strategy.
```typescript
const DEFAULT_STRIP_TAGS = [
  "nav", "header", "footer", "aside",
  "script", "style", "noscript", "iframe", "svg",
]
```
I also strip landmark roles like navigation/banner and HTML comments.
I considered DOM-based parsing, but Edge constraints pushed me toward smaller, browser-compatible code. So I chose a pragmatic middle ground: regex + role-based stripping. It is less theoretically perfect than a full parser, but much better for runtime footprint and cold-start behavior.
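As a hypothetical sketch of that regex strategy (the package's actual implementation may differ, and role-based stripping adds more cases than shown here):

```typescript
const DEFAULT_STRIP_TAGS = [
  "nav", "header", "footer", "aside",
  "script", "style", "noscript", "iframe", "svg",
]

function stripBoilerplate(html: string): string {
  // Remove HTML comments first
  let out = html.replace(/<!--[\s\S]*?-->/g, "")
  // Remove each boilerplate tag together with its contents;
  // a non-greedy match handles non-nested occurrences of the same tag
  for (const tag of DEFAULT_STRIP_TAGS) {
    const re = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?</${tag}>`, "gi")
    out = out.replace(re, "")
  }
  return out
}
```

The whole thing is a handful of string replaces, which is exactly why it stays within Edge bundle-size limits where a DOM parser would not.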
Pre-building static pages
Self-fetch + convert is fine for dynamic pages, but wasteful for static pages.
So I added a build step:
npx next-agent-md build
This command is intentionally designed to run after next build. At that point Next.js has already materialized static HTML, so next-agent-md build can do pure transformation work with zero runtime ambiguity.
Under the hood, the command does three things:
- reads `.next/prerender-manifest.json` to discover which routes are truly static
- maps each route to its generated HTML file inside `.next/server/...`
- converts those files to markdown and writes them to `public/.well-known/markdown/`
That output path matters because it is deployable static content, so your CDN can serve it directly without invoking any markdown conversion logic at request time.
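The route-to-file mapping at the heart of the build step can be sketched as a pure function (the name `routeToMarkdownPath` is illustrative, not the package's API):

```typescript
const OUTPUT_DIR = "public/.well-known/markdown"

function routeToMarkdownPath(route: string): string {
  // "/" becomes index.md; "/blog/hello-world" becomes blog/hello-world.md
  const rel = route === "/" ? "index.md" : `${route.replace(/^\//, "")}.md`
  return `${OUTPUT_DIR}/${rel}`
}
```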
The resulting layout looks like:

```
public/.well-known/markdown/
  index.md
  about.md
  blog/
    hello-world.md
    getting-started.md
```
Fast path at runtime
```typescript
const prebuilt = await fetchPrebuiltMarkdown(request)
if (prebuilt) {
  return new Response(prebuilt, {
    headers: {
      "content-type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(estimateTokens(prebuilt)),
      "x-markdown-source": "prebuilt",
    },
  })
}

// Fall back to the self-fetch lane for dynamic routes
const html = await fetchPageHtml(request, skipHeader)
```
This split (prebuilt for static, self-fetch for dynamic) keeps runtime overhead low where it can be low, without dropping coverage for dynamic routes. Architecturally, this became a two-lane system:
- build-time lane for static routes (cheap at runtime)
- request-time lane for dynamic routes (always correct)
That balance gave me good latency without sacrificing correctness.
It also made operations simpler: dynamic routes still work automatically, while static routes become precomputed artifacts you can inspect, diff, and debug in CI.
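The prebuilt lookup itself can be sketched as a URL mapping plus a fetch. This is a guess at the shape, not the package's actual implementation; function names here are mine:

```typescript
// Map a page URL to its prebuilt markdown artifact URL
function prebuiltMarkdownUrl(pageUrl: string): string {
  const url = new URL(pageUrl)
  // "/" maps to index.md; other routes map to "<route>.md"
  const route = url.pathname === "/" ? "/index" : url.pathname.replace(/\/$/, "")
  url.pathname = `/.well-known/markdown${route}.md`
  return url.toString()
}

// At request time, try the static artifact first; null means "use the slow lane"
async function fetchPrebuiltMarkdown(pageUrl: string): Promise<string | null> {
  const res = await fetch(prebuiltMarkdownUrl(pageUrl))
  return res.ok ? await res.text() : null
}
```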
Wire it into build:
```json
{
  "scripts": {
    "build": "next build && next-agent-md build"
  }
}
```
Token estimation
Each markdown response includes x-markdown-tokens so agents can budget context.
I used a practical approximation (~1 token per 4 chars) because tiktoken + WASM is not Edge-friendly in this setup.
```typescript
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}
```
This was another deliberate tradeoff: I wanted predictable, low-cost metadata over exact token accounting. The estimate is close enough for context budgeting, and cheap enough to compute on every request.
CLI setup
Setup is intentionally minimal:
```bash
npm install next-agent-md
npx next-agent-md init
```
init will:
- detect Next.js version
- create `proxy.ts` for Next.js 16+ (`middleware.ts` for older versions)
- avoid overwriting existing middleware
- print test curl commands
I spent extra time on init because integration friction usually kills small infrastructure tools. If setup is not one command, most people will never try it.
Example output:
```
✔ Created proxy.ts

AI agents can now request Markdown from any page:

  curl -H "Accept: text/markdown" http://localhost:3000/
```
Test it now
You can test a live endpoint directly with curl:
```bash
curl -si -H "Accept: text/markdown" https://fayeed.dev
```
If you’re testing locally or on your own domain:

```bash
curl -si -H "Accept: text/markdown" http://localhost:3000/
```
Look for:
- `content-type: text/markdown`
- `x-markdown-tokens`
- `x-markdown-source` (when prebuilt markdown is used)
Edge runtime constraints
At request time (Edge Runtime), there are no Node APIs:

- no `fs`
- no `child_process`
- no Node-only libs
So runtime conversion code is Edge-safe, while build tooling (next-agent-md build) runs in Node and can use fs freely.
Separating Edge-time code from Node-time code became a core architectural boundary. Once I made that explicit in package exports, the implementation got much cleaner.
```json
{
  "exports": {
    ".": { "import": "./dist/edge.js" },
    "./config": { "import": "./dist/node-config.js" }
  }
}
```
What’s next
A few improvements I still want:
- Streaming conversion to reduce TTFB for large pages
- Better caching controls (`Cache-Control`, CDN behavior)
- More robust stripping via lightweight streaming HTML parsing
If you want to try it:
- GitHub: fayeed/next-agent-md
- npm install + init:
```bash
npm install next-agent-md
npx next-agent-md init
```
Thanks for reading. If you test this in production, I’d love feedback on edge cases.