GitHub Repository

Official PyTorch implementation of NeuroFlow: EMA-Gated Temporal Sequence Compression for Vision Transformers. Achieves up to 55.8x wall-clock speedup for video inference via semantic surprise routing and a training-free Dual-Memory Reconstruction Protocol.
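As a rough illustration of what "EMA-gated" surprise routing could mean, the sketch below keeps an exponential moving average (EMA) of each token position across frames and only flags tokens whose cosine distance from that average exceeds a threshold. This is not the repository's code: the class name `EmaSurpriseGate`, the `decay` and `threshold` values, and the choice of cosine distance as the surprise measure are all assumptions made for the example, and it is written in plain Python rather than PyTorch to stay dependency-free.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; treat zero vectors as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class EmaSurpriseGate:
    """Hypothetical gate: flags tokens that deviate from their running average."""

    def __init__(self, decay=0.9, threshold=0.1):
        self.decay = decay          # assumed EMA decay, not taken from the paper
        self.threshold = threshold  # assumed surprise cutoff, not taken from the paper
        self.ema = None             # per-token running average, set on first frame

    def route(self, tokens):
        """Return indices of tokens that should be recomputed for this frame."""
        if self.ema is None:
            # First frame: nothing to compare against, so every token is active.
            self.ema = [list(tok) for tok in tokens]
            return list(range(len(tokens)))

        active = []
        for i, tok in enumerate(tokens):
            if cosine_distance(tok, self.ema[i]) > self.threshold:
                active.append(i)
            # Update the running average for every token, active or not.
            self.ema[i] = [
                self.decay * m + (1.0 - self.decay) * x
                for m, x in zip(self.ema[i], tok)
            ]
        return active


gate = EmaSurpriseGate()
first = gate.route([[1.0, 0.0], [0.0, 1.0]])   # first frame: all tokens active
second = gate.route([[1.0, 0.0], [1.0, 0.0]])  # only the second token changed
sparsity = 1.0 - len(second) / 2               # fraction of tokens skipped
```

In this toy setup the skipped tokens would reuse cached features from an earlier frame; how the actual project reconstructs them (its Dual-Memory Reconstruction Protocol) is not described on this page and is not modeled here.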

18 stars · Python

NeuroFlow 55.8x video inference speedup for Vision Transformers PyTorch

Author: ynnk

by ynnk · May 26, 2026 · 8 points · 2 comments


AI Analysis

●● Solid · Big Brain · Wizardry

Training-free dual-memory protocol cuts 1792p SigLIP inference from 678ms to 11.9ms.

Strengths

• Specific 55.8x wall-clock speedup with verifiable benchmark numbers and methodology
• Training-free Architecture C retains 92.4% of dense accuracy at 84% token sparsity
• Complete repo with paper, weights, verification scripts, and Hugging Face models

Weaknesses

• Video ViT optimization is an active research space that already has multiple competing approaches
• LLM ablation shows 0% token drift, but testing has limited scope beyond Phi-3-mini