Extracted tech from 5.6M sites (plus versions) and made some dashboards

by _chse_·Mar 5, 2026·1 point·0 comments

AI Analysis

●Mid·Eye Candy

Crawled 5.6M sites, but tech stack distribution dashboards already exist (BuiltWith, Wappalyzer).

Strengths

•Massive fresh dataset (5.6M domains) with granular version-level breakdowns useful for decision-making.
•Clean dashboard UI makes tech distribution visually intuitive and export-friendly.

Weaknesses

•BuiltWith and Wappalyzer have solved this problem at scale with richer filtering and company linkage.
•No novel analysis, filtering, or comparison tools—purely presentational dashboards over raw data.

Similar Projects

Other·●●Solid

The Crawl Times – Newspaper-style front pages for tech news sites

Cloudflare /crawl API powers nostalgic newspaper layouts for Hacker News and friends.

Eye Candy·Cozy

g_br_l

50·2mo ago

Developer Tools·●●Solid

CLI for crawling documentation sites into Markdown with defuddle

No-browser docs crawler using defuddle when Firecrawl and JinaAI already exist.

Ship It·Solve My Problem

nistuley

50·10d ago

Design·●●Solid

Design Memory – Extract design systems from live websites via CLI

Playwright-driven crawling, deterministic token extraction, and an LLM for semantic labeling make a clever pipeline: rather than just scraping CSS, it produces an AI-optimized .design-memory folder with tokens, component recipes, and multi-page merge/diff capabilities. Expect variable fidelity on highly dynamic or framework-heavy sites, since the approach depends on selector heuristics and an API key, but the CLI commands (learn, install, diff) and the docs show this is more than a research sketch.

Wizardry·Niche Gem

saleban1031

10·3mo ago

Developer Tools·●●Solid

Indxel – Your build should fail on broken meta tags

It treats meta and OG tags as first-class testable artifacts: one command scaffolds seo.config.ts, sitemap, and OG routes; another crawls and scores pages; and indxel check --ci fails your build on regressions, with deploy diffs that show exactly what changed. The combination of type-safe config, auto-generated JSON-LD, auto-indexation retries, and an optional AI audit (Claude) feels thoughtful and developer-oriented; this isn't just prettier Lighthouse output, it's CI-friendly SEO infrastructure.

Solve My Problem·Slick

YannBuilds

10·4mo ago

Social·●Mid

A Hacker News–style site focused on European tech

HN mirror for Europe, but aggregation without new insights doesn't create stickiness.

Dark Horse

davedx

52·3mo ago

SaaS·●Mid

Add a knowledge chat widget to your static site with one script tag

Docs chatbot for static sites—but Mendable, Librarian, and ChatBot already own this space.

Crowd Pleaser·Ship It

shivaodin

10·3mo ago