
H.E.I.M.D.A.L.L analyzes fleet telemetry and returns natural-language insights. It combines GPU data loading (cuDF), local LLM inference (Gemma 2), and production NIM on GKE. Open the notebooks, run the cells, and get answers. The quick start should take no more than 10 minutes, and the T4 path is completely free!

18 stars · Jupyter Notebook

H.E.I.M.D.A.L.L – Telemetry-to-insight pipeline for fleet telemetry

by starksriram · Feb 18, 2026 · 1 point · 0 comments

AI Analysis

Mid · Niche Gem · Wizardry
The Take

Combines GPU-first data ingest (cuDF + UVM) with format-aware inference choices (GGUF for local Gemma 2, TensorRT for production) and ships three runnable notebooks, including a full NIM-on-GKE deployment, so you can benchmark pandas against cuDF and walk through local-to-cloud inference. Clever and practical for teams that actually need to scale telemetry queries. Reproducing the full stack, however, means non-trivial ops work, lock-in to NVIDIA and GCP tooling, and cloud costs.
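The format-aware choice mentioned above can be pictured as a small dispatch step. The sketch below is only an illustration: the `InferenceChoice` type, the `choose_inference` helper, and the environment names `"local"` and `"production"` are hypothetical and not taken from the repository. It assumes nothing beyond the pairing the listing describes (GGUF for local Gemma 2, TensorRT for production NIM on GKE).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceChoice:
    """Hypothetical record of which model format and serving path to use."""
    model_format: str  # "gguf" or "tensorrt"
    runtime: str       # human-readable label for the serving path


def choose_inference(environment: str) -> InferenceChoice:
    """Map a deployment environment to a model format (illustrative only)."""
    env = environment.strip().lower()
    if env == "local":
        # Local path described in the listing: GGUF weights for Gemma 2.
        return InferenceChoice("gguf", "local Gemma 2")
    if env == "production":
        # Production path described in the listing: TensorRT behind NIM on GKE.
        return InferenceChoice("tensorrt", "NIM on GKE")
    raise ValueError(f"unknown environment: {environment!r}")


local_choice = choose_inference("local")
prod_choice = choose_inference("Production")
print(local_choice)
print(prod_choice)
```

Keeping the decision in one function means the notebooks could share a single place to change if another format or serving target were added later.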

Target Audience

Data engineers, ML/robotics engineers, and SREs working with large fleet telemetry who want GPU-accelerated analytics and LLM-driven querying

Post Description

I built a pipeline for querying fleet telemetry (AVs, robots, vehicles) in natural language. Load Parquet or CSV into cuDF, then ask things like "Which vehicles exceeded 120 km/h in region X?" and get back IDs and metrics. Tech: cuDF for GPU ingest/analytics, NVIDIA NIM on GKE for LLM inference, format-aware model selection (GGUF local, TensorRT prod). Three notebooks: data ingest with pandas vs cuDF vs cudf.pandas benchmarks, local Gemma 2 inference, and full NIM deployment. Runs on Colab with a T4 for notebooks 1 and 2; notebook 3 uses GCP and NIM on GKE.
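As a rough picture of what the example question resolves to once it reaches the DataFrame layer, the sketch below filters a small telemetry table for vehicles over a speed limit in one region. The column names (`vehicle_id`, `region`, `speed_kmh`), the sample rows, and the `vehicles_over_speed` helper are assumptions made for this example and do not come from the repository. It uses pandas so it runs without a GPU; cuDF exposes a largely pandas-compatible DataFrame API, so the same filter and groupby would likely carry over after loading the data with `cudf.read_parquet`.

```python
import pandas as pd  # on a GPU runtime, cuDF's pandas-like API could stand in here


def vehicles_over_speed(df, region, limit_kmh):
    """Return one row per vehicle in `region` whose speed exceeded `limit_kmh`.

    Column names are hypothetical; adapt them to the real telemetry schema.
    """
    hits = df[(df["region"] == region) & (df["speed_kmh"] > limit_kmh)]
    # Collapse repeated readings to the peak speed per vehicle.
    summary = (
        hits.groupby("vehicle_id", as_index=False)["speed_kmh"]
        .max()
        .rename(columns={"speed_kmh": "max_speed_kmh"})
    )
    return summary.sort_values("vehicle_id").reset_index(drop=True)


# Tiny made-up sample standing in for a Parquet or CSV load.
telemetry = pd.DataFrame(
    {
        "vehicle_id": ["av-1", "av-1", "av-2", "av-3", "av-3"],
        "region": ["X", "X", "X", "Y", "X"],
        "speed_kmh": [118.0, 131.5, 99.0, 140.0, 125.0],
    }
)

result = vehicles_over_speed(telemetry, region="X", limit_kmh=120)
print(result)
```

In the pipeline described here, the LLM's role would be to turn the natural-language question into the `region` and `limit_kmh` arguments (or an equivalent query); how the project actually does that translation is not shown in this listing.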

Similar Projects

Security · ●● Solid

GPU-accelerated search for Bitcoin keys generated with weak entropy

This reads like a GPU engineer's field notes: one ~3,400-line CUDA file implements a full per-thread crypto pipeline (key gen → EC multiply → SHA-256 → RIPEMD-160) and a two-stage bloom+binary-search matcher to check ~3,100 targets at ~100M keys per batch. The article digs into concrete low-level choices (LUT layout, memory hierarchy, __ldg reads, atomicCAS reporting, and per-mode keygen strategies), which is rare in public writeups. The downsides are that it is closed-source and that the dual-use and ethical implications should be called out more explicitly.
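The two-stage matcher idea is generic set membership: a Bloom filter cheaply rejects most candidates, and a binary search over a sorted list confirms the few that pass. The CPU-side sketch below illustrates that pattern on arbitrary byte strings. It is not the closed-source project's code; the `TwoStageMatcher` class, the bit-array size, and the SHA-256-based hash positions are all assumptions chosen for readability.

```python
import bisect
import hashlib


class TwoStageMatcher:
    """Bloom-filter prefilter followed by an exact binary search (illustrative)."""

    def __init__(self, targets, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits          # must be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)
        self.sorted_targets = sorted(set(targets))
        for target in self.sorted_targets:
            for pos in self._positions(target):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item):
        # Derive several bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def maybe_contains(self, item):
        # Stage 1: may return false positives, never false negatives.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def contains(self, item):
        if not self.maybe_contains(item):
            return False
        # Stage 2: exact confirmation via binary search on the sorted targets.
        i = bisect.bisect_left(self.sorted_targets, item)
        return i < len(self.sorted_targets) and self.sorted_targets[i] == item


matcher = TwoStageMatcher([b"alpha", b"bravo", b"charlie"])
print(matcher.contains(b"bravo"), matcher.contains(b"delta"))
```

The second stage is what makes the answer exact: even if the Bloom filter lets a non-member through, the binary search rejects it.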

Wizardry · Niche Gem
orkblutt
213mo ago