Search dashcam footage by describing what happened
Gemini video embeddings for dashcam search when Google Photos already indexes footage.
Semantic search over videos using Gemini Embedding 2 or Qwen3-VL.
Native video embedding beats captioning pipelines; Gemini Embedding 2 does the heavy lifting.
Developers working with video footage, security applications, dashcam users
JinaAI · Firecrawl · Video understanding APIs
The interesting part: Gemini Embedding 2 projects raw MP4 video directly into the same vector space as text, with no captioning or transcription pipeline in between. You embed 30-second video chunks as RETRIEVAL_DOCUMENT, embed a text query as RETRIEVAL_QUERY, and cosine similarity works across modalities.
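To make the retrieval step concrete, here is a minimal sketch of ranking chunks by cosine similarity once the embeddings exist. It does not call the Gemini API; the vectors are toy values, and the names `cosine_similarity`, `rank_chunks`, and the chunk ids are hypothetical, not taken from the tool.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs sorted from best to worst match.

    chunk_vecs maps a chunk id to its embedding; in the real tool these
    would be the video-chunk embeddings, here they are placeholders.
    """
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
chunks = {
    "clip_000": [0.9, 0.1, 0.0],
    "clip_001": [0.0, 1.0, 0.0],
}
best_id, best_score = rank_chunks(query, chunks)[0]
```

In practice a vector store such as ChromaDB would handle this ranking; the sketch only shows what "cosine similarity across modalities" amounts to once text and video share one space.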
The tool splits footage into overlapping chunks, indexes them in a local ChromaDB instance, and auto-trims the top match from the original file via ffmpeg.
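The chunking and trimming steps could look roughly like the sketch below. The 30-second chunk length comes from the description; the 5-second overlap, the function names, and the file names are assumptions for illustration, not the tool's actual values. The ffmpeg arguments use stream copy, which is fast but cuts on keyframes, so the trimmed clip may start slightly off the requested time.

```python
def chunk_windows(duration_s, chunk_s=30.0, overlap_s=5.0):
    """Return (start, end) windows in seconds covering the whole clip.

    Consecutive windows share overlap_s seconds; the last window is
    clipped to the end of the footage. overlap_s=5.0 is an assumed value.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    step = chunk_s - overlap_s
    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start += step
    return windows


def ffmpeg_trim_command(src, start_s, end_s, dst):
    """Build an ffmpeg argument list that copies [start_s, end_s] out of src."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",          # seek to the window start
        "-i", src,
        "-t", f"{end_s - start_s:.3f}",   # keep only the window's duration
        "-c", "copy",                     # stream copy, no re-encode
        dst,
    ]


windows = chunk_windows(70.0)
# Pretend the second window was the top match and build its trim command.
cmd = ffmpeg_trim_command("drive.mp4", *windows[1], "match.mp4")
```

The list returned by `ffmpeg_trim_command` can be passed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed.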
Indexing costs about $2.50 per hour of footage; query costs are negligible. There is definitely room to optimize: skipping still frames, using scene detection for smarter chunking, etc.
Direct video-to-vector embedding skips transcription entirely—Twelve Labs but self-hosted.
Semantic search for saved knowledge, but RAG + embeddings compete with Obsidian, Logseq, Pinecone.
RAG over Epstein PDFs works offline, but sensationalism and crypto-tip jar hurt credibility.
12ms exact call-graph queries beat 1,400ms embeddings; 35 languages, zero cloud required.
Search XKCD by image or text description using Gemini multimodal embeddings.