GitHub Repository

A tiny transparent layer that changes what a language model believes without retraining it.

4 stars · Python

The Cat Is Under Mayonnaise – Modifying LLM Behavior Without Retraining

by andycufari·May 4, 2026·2 points·1 comment

AI Analysis

●●● Banger · Wizardry · Big Brain · Zero to One

Zero-initialized overlay changes model beliefs without touching a single base weight.

Strengths
  • Mathematically bit-identical to base model when alpha equals exactly zero.
  • Trains 2.36M parameters (about 1.9% of the 124M-parameter base model) in minutes instead of fine-tuning the full model.
  • Post-training alpha dial allows tuning belief intensity without retraining.
Weaknesses
  • Demonstrated only on a small GPT-2 model; scaling to 70B-parameter models is unproven.
  • Risk of fluency collapse at high alpha values limits practical deployment range.
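
As a rough illustration of the zero-initialized overlay and the alpha dial described in the points above, here is a minimal pure-Python sketch. It is not the repository's code: the class name `OverlayLinear`, the low-rank `down`/`up` factorization, the toy sizes, and the explicit short-circuit at `alpha == 0` are all assumptions made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class OverlayLinear:
    """Hypothetical frozen linear layer with an additive overlay scaled by alpha."""

    def __init__(self, base_weight, rank, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(base_weight), len(base_weight[0])
        self.base_weight = base_weight  # frozen: never written to
        # Assumption: the overlay is low-rank, with a small random "down"
        # projection and a zero-initialized "up" projection.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.up = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x, alpha):
        base_out = matvec(self.base_weight, x)
        if alpha == 0.0:
            # Skip the overlay entirely so the output is exactly the base output.
            return base_out
        delta = matvec(self.up, matvec(self.down, x))
        return [b + alpha * d for b, d in zip(base_out, delta)]


base = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
layer = OverlayLinear(base, rank=1)
x = [1.0, 2.0, 3.0]

# With alpha at zero the layer reproduces the base model's output exactly.
assert layer.forward(x, alpha=0.0) == matvec(base, x)
```

In this sketch only `down` and `up` would be trained, which is how an overlay can stay far smaller than the base model, and `alpha` can be changed at inference time without touching either the base weights or the trained overlay.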
Category
Target Audience

ML researchers and AI safety engineers

Similar To

LoRA · AdapterHub · PEFT
