SnapAPI – Screenshot/PDF/Extract API Built with Fastify and Playwright

Name: SnapAPI – Screenshot/PDF/Extract API Built with Fastify and Playwright
Availability: InStock
Author: Sleywill

by Sleywill·Feb 19, 2026·2 points·1 comment

Visit Project View on HN

AI Analysis

●●SolidSlickSolve My Problem

The Take

SnapAPI packs screenshots, PDFs, video capture and structured extraction into one Fastify+Playwright endpoint, with practical features like ad/cookie blocking, element selectors and a markdown/article extraction backed by Readability. The real selling point is operational: handling browser contexts, crash recovery and fair billing so teams don't fight Playwright in production. Useful and pragmatic — but it's entering a crowded market where uptime, latency and pricing will decide whether it matters.

Post Description

Hi HN,

I'm a solo developer based in Tbilisi, Georgia, and I've been building SnapAPI for the past few months. It's a REST API that does four things: screenshots, PDFs, video recording, and structured data extraction from web pages.

*Why I built it:* I was working on a project that needed link previews. I tried self-hosting Playwright — it works, but managing browser contexts, memory leaks, cookie banners, and crash recovery is a surprising amount of operational work for what should be a simple task. The existing APIs (ScreenshotOne, Urlbox) are solid but expensive and fragmented — you need separate services for screenshots vs. data extraction.

*Architecture:*

- *Fastify* handles the API layer. Chose it over Express for the schema validation (I use it as the single source of truth for request validation) and the ~2x throughput difference under load. - *Playwright* over Puppeteer. The cross-browser support doesn't matter for this use case, but Playwright's `browser.newContext()` isolation and the built-in auto-wait are more reliable for capturing pages at the right moment. - *BullMQ + Redis* for job queuing. Screenshot requests go into a queue, a separate worker process picks them up. This decouples the API from the heavy rendering work and lets me set per-request timeouts without blocking the event loop. - *LRU cache* (in-memory + Redis) with content-hash keys. Same URL + same options = cache hit. This alone handles ~40% of requests on busy days. - *S3-compatible storage* for result delivery. Screenshots get uploaded to object storage, the API returns a pre-signed URL. Files auto-expire after 24h. - *Cookie banner blocking* uses @ghostery/adblocker-playwright with custom filter lists. It's not perfect (nothing is), but it catches ~90% of GDPR popups. For the rest, I inject custom JS to detect and dismiss common consent frameworks (OneTrust, CookieBot, etc.).

The whole thing runs on a single 4-core/8GB VPS in Amsterdam. Playwright is the bottleneck — each browser context uses ~80-120MB — so I cap concurrent renders at 6 and queue the rest. Under normal load (~500-1000 requests/day currently), p95 latency is around 3 seconds for a full-page screenshot.

*What's different from existing services:*

- Combined screenshot + PDF + video + data extraction in one API (most services only do screenshots) - AVIF output format (50-80% smaller than PNG, not widely supported elsewhere) - `extract` endpoint that returns clean markdown, plain text, or structured metadata — useful for RAG/LLM preprocessing pipelines - Starts at $9/month instead of $29-99/month

*What I'm still figuring out:*

- Video recording (MP4 of a browsing session) is the newest feature and the least optimized. FFmpeg encoding after capture adds 2-5 seconds of overhead. - Scaling beyond one server. Playwright doesn't cluster well — I'll probably need to go multi-server with a load balancer routing to render nodes. Haven't needed it yet. - Whether the $9 price point is sustainable. The compute cost per screenshot is ~$0.001, but there's a lot of variance depending on page complexity.

Free tier: 100 requests/month, no credit card. I'm not trying to bait-and-switch — the free tier is genuinely useful for side projects and testing.

Docs: https://snapapi.pics/docs.html

Happy to discuss the architecture, Playwright quirks, or the economics of running a screenshot API. I've learned a lot about browser automation edge cases that I didn't expect going in.

Similar Projects

Developer Tools●Mid

SnapAPI – Screenshot, metadata extraction, and PDF generation API

Yet another screenshot API competing with ApiFlash and ScreenshotAPI.

Ship It

msmolkin

1023d ago

Developer Tools●Mid

AI-Powered Web Automation APIs (Screenshot, Scrape, SEO, PDF)

Packages common web automation tasks — screenshots, scrapes, SEO checks and PDFs — into APIs, which is convenient but very crowded territory. The live share is broken (the page shows 'zrok share ... not found'), so you can't test reliability or AI value‑adds; unless it provides robust semantic SEO insights, evasion/anti-bot handling, or superior extraction accuracy, it's another Puppeteer/Playwright wrapper.

Ship ItNiche Gem

openclaw_ai

203mo ago

Developer Tools●●Solid