I indexed 37h of my videos using an RTX 4090 and local ML models in 24h


by iliashad·Jun 30, 2026·3 points·1 comment

AI Analysis

●● Solid · Wizardry · Big Brain

RTX 4090 crushes M1 Max for local video indexing with six ML plugins running in parallel.

Strengths
  • Plugin architecture (DescriptorPlugin, FaceRecognitionPlugin, TextDetectionPlugin) enables modular video analysis.
  • Detailed per-frame timing metrics show genuine engineering depth and benchmarking rigor.
  • Self-hosted Docker deployment gives full control versus cloud video analysis APIs.
Weaknesses
  • Submitting benchmark JSON instead of the actual tool limits what others can evaluate or use.
  • Video indexing space has established players (Twelve Labs, VideoDB) with more complete offerings.
Target Audience

Developers building video analysis or content indexing systems

Similar To

Twelve Labs · VideoDB · Videoblocks

Post Description

TLDR: This is a follow-up to my recent blog post and Hacker News post (https://news.ycombinator.com/item?id=48528029), where I ran the desktop app on my M1 Max. This time, I’m using the self-hosted version, running in Docker, with an NVIDIA RTX 4090 (24 GB of VRAM).

The content is also fundamentally more demanding: long podcast episodes with at least two faces in every frame, coding tutorials packed with on-screen text, and screen recordings. GoPro footage is mostly wide outdoor shots.

Even so, the RTX 4090 was much faster than my M1 Max.

The longest video, a 3h 12m livestream, was indexed in 1h 52m (4,612 frames analyzed).
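As a rough sanity check on the livestream figures, here is a minimal Python sketch that turns them into throughput numbers. The function and variable names are hypothetical and are not taken from the tool. It also assumes the analyzed frames are spread evenly across the video, which the post does not state.

```python
# Figures quoted in the post for the longest video (a livestream).
video_seconds = 3 * 3600 + 12 * 60      # 3h 12m of footage
indexing_seconds = 1 * 3600 + 52 * 60   # 1h 52m of wall-clock processing
frames_analyzed = 4612


def throughput(video_s, indexing_s, frames):
    """Return (speed relative to real time,
    wall-clock seconds per analyzed frame,
    seconds of footage per analyzed frame).

    The last value assumes evenly spaced frames (an assumption, not a
    detail confirmed by the post).
    """
    return video_s / indexing_s, indexing_s / frames, video_s / frames


speed, per_frame, interval = throughput(
    video_seconds, indexing_seconds, frames_analyzed
)
print(
    f"{speed:.2f}x real time, "
    f"{per_frame:.2f} s of processing per analyzed frame, "
    f"one analyzed frame per {interval:.2f} s of footage"
)
```

With these inputs the livestream works out to roughly 1.71x real time, about 1.46 s of wall-clock time per analyzed frame, and about one analyzed frame for every 2.5 s of footage. The per-frame figure is total elapsed time divided by frame count, so it includes any overhead that is not frame analysis.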

The processing job results are available in JSON format here: https://gist.github.com/IliasHad/fd64e4d331e90e57d61e95f64e8...

Similar Projects