GitHub Repository

Official PyTorch implementation of NeuroFlow: EMA-Gated Temporal Sequence Compression for Vision Transformers. Achieves up to 55.8x wall-clock speedup for video inference via semantic surprise routing and a training-free Dual-Memory Reconstruction Protocol.
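The description names the mechanism but does not spell it out. As a rough illustration only, the sketch below assumes that "semantic surprise" means the cosine distance between each patch token and a per-position exponential moving average (EMA) of past tokens, and that tokens under a threshold skip the dense path. The class `EmaSurpriseGate`, its `alpha` and `threshold` parameters, and the toy frames are all hypothetical and are not taken from the NeuroFlow repository; plain Python lists stand in for tensors so the example runs without dependencies.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class EmaSurpriseGate:
    """Hypothetical gate that keeps one EMA vector per patch position."""

    def __init__(self, alpha=0.9, threshold=0.1):
        self.alpha = alpha          # EMA decay (assumed value)
        self.threshold = threshold  # surprise cutoff (assumed value)
        self.memory = None          # filled on the first frame

    def route(self, tokens):
        """Return indices of tokens that should take the dense path."""
        if self.memory is None:
            # First frame: nothing to compare against, so process everything.
            self.memory = [list(t) for t in tokens]
            return list(range(len(tokens)))
        active = []
        for i, tok in enumerate(tokens):
            # Surprise is measured against the memory before it is updated.
            if cosine_distance(tok, self.memory[i]) > self.threshold:
                active.append(i)
            self.memory[i] = [
                self.alpha * m + (1.0 - self.alpha) * x
                for m, x in zip(self.memory[i], tok)
            ]
        return active


gate = EmaSurpriseGate(alpha=0.9, threshold=0.1)
frame0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
frame1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]  # only patch 3 changes

first = gate.route(frame0)
second = gate.route(frame1)
sparsity = 1.0 - len(second) / len(frame1)
print(first, second, sparsity)
```

In this toy run every token is processed on the first frame, and only the changed patch is routed on the second. How the skipped tokens are reconstructed (the Dual-Memory Reconstruction Protocol) is not described in this summary and is not modeled here.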

18 stars · Python

NeuroFlow 55.8x video inference speedup for Vision Transformers PyTorch

by ynnk·May 26, 2026·8 points·2 comments

AI Analysis

●● Solid · Big Brain · Wizardry

Training-free dual-memory protocol cuts 1792p SigLIP inference from 678ms to 11.9ms.
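As a quick sanity check on the figures quoted above, the ratio of the two latencies can be computed directly. This is plain arithmetic on the rounded numbers in this summary, not a reproduction of the benchmark; the variable names are illustrative.

```python
# Latencies as quoted in the summary (milliseconds per inference).
dense_ms = 678.0
neuroflow_ms = 11.9

speedup = dense_ms / neuroflow_ms      # ratio of the rounded latencies
dense_fps = 1000.0 / dense_ms          # implied throughput before
neuroflow_fps = 1000.0 / neuroflow_ms  # implied throughput after

print(f"speedup: {speedup:.1f}x")
print(f"throughput: {dense_fps:.2f} fps -> {neuroflow_fps:.1f} fps")
```

The rounded latencies give a ratio of roughly 57x, slightly above the 55.8x headline. The summary does not say which measurement the headline comes from, so the gap may simply reflect rounding or a different benchmark run.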

Strengths
  • Specific 55.8x wall-clock speedup with verifiable benchmark numbers and methodology
  • Training-free Architecture C retains 92.4% of dense accuracy at 84% token sparsity
  • Complete repo with paper, weights, verification scripts, and Hugging Face models
Weaknesses
  • Video ViT optimization is an active research space that already has multiple competing approaches
  • LLM ablation shows 0% token drift, but testing is limited to Phi-3-mini
Category
Target Audience

ML researchers and engineers working on video inference optimization

Similar To

Token Merging (ToMe) · DynamicViT · Sparse ViT approaches

Similar Projects

AI/ML · ●● Solid

PyTorch on Java

LibTorch bindings bring CUDA and MPS backends to Java with LLaMA-3 inference included.

Niche Gem · Big Brain
pdsminer
202d ago
AI/ML · ●● Solid

BNNR – a closed-loop pipeline for improving vision models

XAI-driven model improvement loop, but Weights & Biases already tracks experiments better.

Big Brain · Niche Gem
dominka
102mo ago