AI Sees Me – CLIP running in the browser

Author: jayyvk

by jayyvk · Mar 1, 2026 · 1 point · 0 comments


AI Analysis

●● Solid · Wizardry · Big Brain

CLIP embeddings live in the browser, but embedding visualizers already exist.

Strengths

• Getting CLIP to run at usable speeds in WASM while handling live video is a genuine constraint problem.
• Makes embedding-space similarity tangible through direct visual feedback, turning an abstract concept into something concrete.
• Zero server, fully local inference: no hidden API calls or data leakage.

Weaknesses

• Browser-based CLIP tools exist (ml5.js, fast.ai notebooks). This iteration doesn't say what sets it apart beyond the specific implementation.
• No interactive exploration UI beyond the basic text input; it feels more like a demo than a tool you'd return to.

Post Description

I built a tool that runs OpenAI's CLIP model entirely in your browser using Transformers.js and ONNX Runtime Web. It encodes your webcam feed into vector embeddings and compares them in real time against any text you type. No server, no API calls; all inference happens locally. The interesting technical challenge was getting CLIP to run at usable speeds in WASM while processing live video frames. I wanted to make the concept of embeddings and similarity scores tangible rather than abstract. GitHub: https://github.com/jayyvk/howaiseesme
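
To make the "similarity score" idea concrete, here is a minimal TypeScript sketch of how an image embedding can be compared against text embeddings once a CLIP model has produced them. It is not taken from the project's source: the helper names (l2Normalize, cosineSimilarity, softmax, scorePrompts) are hypothetical, the 3-dimensional vectors are toy stand-ins for real CLIP embeddings, and the logit scale of 100 is an assumption based on the temperature commonly reported for the released CLIP checkpoints.

```typescript
// Hypothetical helpers for illustration only; not the project's actual code.

/** Scale a vector to unit length so that a dot product equals cosine similarity. */
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) throw new Error("cannot normalize a zero vector");
  return v.map((x) => x / norm);
}

/** Cosine similarity between two embeddings of equal length, in [-1, 1]. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  const na = l2Normalize(a);
  const nb = l2Normalize(b);
  return na.reduce((sum, x, i) => sum + x * nb[i], 0);
}

/** Numerically stable softmax: subtract the max before exponentiating. */
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

/**
 * Turn similarities between one image embedding and several text embeddings
 * into relative probabilities. logitScale = 100 is an assumed temperature.
 */
function scorePrompts(
  image: number[],
  texts: number[][],
  logitScale = 100
): number[] {
  return softmax(texts.map((t) => logitScale * cosineSimilarity(image, t)));
}

// Toy data: one "frame" embedding and two "prompt" embeddings.
const frame = [0.9, 0.1, 0.0];
const prompts = [
  [1, 0, 0], // points almost the same way as the frame
  [0, 1, 0], // nearly orthogonal to the frame
];

const sims = prompts.map((p) => cosineSimilarity(frame, p));
const probs = scorePrompts(frame, prompts);
console.log(sims, probs);
```

With a single typed prompt, the raw cosine similarity is the natural number to display; the softmax step only becomes meaningful when several prompts compete for the same frame.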