docs(agentic): generate llms.txt and Markdown twins for AI agents #1276
Open
codyjlandstrom wants to merge 1 commit into
Add @signalwire/docusaurus-plugin-llms-txt to emit an llms.txt index, an llms-full.txt corpus, and a per-page Markdown twin (append .md to any docs URL) so AI agents consume clean Markdown instead of scraping HTML.

The plugin post-processes rendered HTML, so internal links resolve to URLs, variables.json substitutions are applied, and MDX components are flattened. A source-reading generator handles none of these.

Scoped to the current version (includeVersionedDocs: false) to keep the corpus current and skip frozen snapshots. A small rehype plugin strips Docusaurus heading-anchor links (.hash-link) that would otherwise render as noisy "Direct link to" fragments.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Cody Landstrom <cody@okteto.com>
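For reviewers unfamiliar with rehype, the heading-anchor cleanup described in the commit message could look roughly like the sketch below. It is not the file from this PR: the function names are hypothetical, and it walks the HAST tree by hand rather than using a helper library. How the transformer is registered with the plugin is not shown here.

```javascript
// Hypothetical sketch, not the plugin file from this PR.
// A rehype plugin is a function that returns a transformer over a HAST tree.

/** True when the node is an <a> element carrying the "hash-link" class. */
function isHashLink(node) {
  if (node.type !== "element" || node.tagName !== "a") return false;
  const className = node.properties?.className;
  // HAST stores class names as an array; tolerate a plain string as well.
  const classes = Array.isArray(className)
    ? className
    : String(className ?? "").split(/\s+/);
  return classes.includes("hash-link");
}

/** Recursively drop hash-link anchors from a node's children, in place. */
function stripHashLinks(node) {
  if (!Array.isArray(node.children)) return;
  node.children = node.children.filter((child) => !isHashLink(child));
  node.children.forEach(stripHashLinks);
}

/** rehype plugin: removes Docusaurus heading-anchor links from the tree. */
function rehypeStripHashLinks() {
  return (tree) => {
    stripHashLinks(tree);
  };
}

// Usage on a hand-built tree: an <h2> with its text and a .hash-link anchor.
const tree = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "h2",
      properties: { id: "install" },
      children: [
        { type: "text", value: "Install" },
        {
          type: "element",
          tagName: "a",
          properties: { className: ["hash-link"], href: "#install" },
          children: [],
        },
      ],
    },
  ],
};
rehypeStripHashLinks()(tree);
console.log(JSON.stringify(tree.children[0].children));
// prints [{"type":"text","value":"Install"}]
```

Removing the anchor element before the HTML is converted means its link label never reaches the Markdown output, while the heading text itself is left alone.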
✅ Deploy Preview for okteto-docs ready!
What

Adds @signalwire/docusaurus-plugin-llms-txt so the docs build emits AI-agent-friendly Markdown alongside the normal HTML:

- llms.txt: index of all current-version pages (llmstxt.org standard)
- llms-full.txt: the full corpus in one file
- Per-page Markdown twins: append .md to any docs URL (e.g. /docs/get-started/install-okteto-cli.md)

Why
Agents currently scrape rendered HTML (nav, scripts, styling) to read our docs. Serving clean Markdown gives them the content directly. This fits the ongoing agentic-docs direction.
Why this plugin
I first spiked the plugin from the EkLine guide (docusaurus-plugin-llms). Because it reads raw .mdx source, it produced degraded output on our MDX-heavy docs:

- Internal links were left unresolved (.mdx source)
- MDX components were not flattened (<CodeBlock>, <Tabs>)
- ${variables.*} placeholders were not substituted

The SignalWire plugin post-processes the rendered HTML, so links resolve to URLs, variables.json substitutions are applied, and MDX components are flattened. A source-reading generator handles none of these.

Notes
- Runs in postBuild and only emits extra files; the served HTML pages are untouched.
- Scoped to the current version (includeVersionedDocs: false) to keep the corpus current and skip frozen snapshots.
- A small rehype plugin strips Docusaurus heading-anchor links (.hash-link) that would otherwise render as noisy "Direct link to …" fragments in the Markdown.
- Admonitions (:::tip) flatten to plain text in the Markdown twins only; content is preserved, styling is lost. Acceptable for agent consumption.

Test plan
- yarn build succeeds; plugin reports 155 documents processed
- yarn serve → /docs/llms.txt, /docs/llms-full.txt, and <any-page>.md resolve

🤖 Generated with Claude Code
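The manual check in the test plan could be scripted along these lines. This is only a sketch under stated assumptions: the helper names are hypothetical, the base URL assumes yarn serve is listening on Docusaurus's default port 3000, and stripping a trailing slash before appending .md is a guess about how twin URLs are formed, not something this PR confirms.

```javascript
// Hypothetical smoke check, not part of this PR.
// Assumes `yarn serve` is running on Docusaurus's default port 3000.
const BASE_URL = "http://localhost:3000";

/** Map a docs page path to its Markdown twin by appending ".md". */
function markdownTwinPath(pagePath) {
  // Assumption: a trailing slash is dropped, so "/docs/foo/" becomes "/docs/foo.md".
  return pagePath.replace(/\/+$/, "") + ".md";
}

/** Paths the test plan expects to resolve, given some sample page paths. */
function pathsToCheck(samplePages) {
  return [
    "/docs/llms.txt",
    "/docs/llms-full.txt",
    ...samplePages.map(markdownTwinPath),
  ];
}

/** Request each path and return the ones that did not answer with a 2xx status. */
async function findBrokenPaths(paths, fetchImpl = fetch, baseUrl = BASE_URL) {
  const broken = [];
  for (const path of paths) {
    const response = await fetchImpl(new URL(path, baseUrl));
    if (!response.ok) broken.push(path);
  }
  return broken;
}

// Usage, with the built site served locally:
// findBrokenPaths(pathsToCheck(["/docs/get-started/install-okteto-cli"]))
//   .then((broken) => console.log(broken.length ? broken : "all resolve"));
```

Passing the fetch implementation as a parameter keeps the check testable without a running server.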