GitHub Repository

Lightweight CLI for crawling documentation sites into Markdown with defuddle

3 stars · TypeScript

CLI for crawling documentation sites into Markdown with defuddle

by nistuley · Jun 15, 2026 · 2 points · 0 comments

AI Analysis

●● Solid · Solve My Problem · Niche Gem

A no-browser doc crawler entering a space that JinaAI and Firecrawl already dominate.

Strengths
  • Defuddle integration extracts clean content without full browser overhead
  • CLI flags for depth, concurrency, and glob patterns give real control
  • Single-file merge option useful for RAG context bundling
Weaknesses
  • Cannot crawl client-rendered sites that require JavaScript execution
  • Docs-to-Markdown conversion for LLMs is a crowded category with established players
Target Audience

Developers building RAG pipelines or LLM context datasets

Similar To

JinaAI · Firecrawl · Crawlee

Post Description

docrawl is a lightweight Node.js CLI for crawling documentation sites and converting them into Markdown with defuddle.

It is built for static and server-rendered docs sites such as Docusaurus, VitePress, MkDocs, GitBook exports, and Obsidian Publish. It does not run a browser and does not execute page JavaScript.
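
To illustrate why a browserless crawler suits static and server-rendered sites, here is a minimal sketch of link discovery with a depth limit, working only on the HTML a server returns. It is not docrawl's actual implementation: `extractLinks`, `planCrawl`, and the injected `fetchHtml` callback are hypothetical names, the regex stands in for a proper HTML parser, and the Markdown conversion step (which docrawl delegates to defuddle) is left out entirely.

```typescript
// Hypothetical sketch; names and structure are assumptions, not docrawl's code.

/** Collect same-origin links from raw HTML, resolved against the page URL. */
function extractLinks(html: string, pageUrl: string): string[] {
  const origin = new URL(pageUrl).origin;
  const links = new Set<string>();
  // A regex is enough for a sketch; a real crawler would use an HTML parser.
  const hrefPattern = /<a\s[^>]*href\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    let resolved: URL;
    try {
      resolved = new URL(match[1], pageUrl);
    } catch {
      continue; // skip malformed hrefs
    }
    resolved.hash = ""; // fragments point into the same document
    if (resolved.origin === origin) links.add(resolved.href);
  }
  return Array.from(links);
}

/** Breadth-first visit order, limited to maxDepth link hops from startUrl. */
function planCrawl(
  startUrl: string,
  maxDepth: number,
  fetchHtml: (url: string) => string | undefined, // assumed injected fetcher
): string[] {
  const visited = new Set<string>([startUrl]);
  const order: string[] = [];
  let frontier: string[] = [startUrl];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const url of frontier) {
      const html = fetchHtml(url);
      if (html === undefined) continue; // page unavailable, nothing to follow
      order.push(url);
      for (const link of extractLinks(html, url)) {
        if (!visited.has(link)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return order;
}
```

Because links are read from the served HTML alone, a page whose navigation is built by client-side JavaScript would expose no links to this approach, which is the limitation the description states.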

Similar Projects