GitHub Repository

Plug-and-play reward monitoring for RL training loops. Catch reward hacking, component imbalance, and starvation before they tank your run. Drop in one .step() call — get balance reports, auto weight correction, alignment scores, and WandB/TensorBoard/SB3 integrations out of the box. → rewardguard.dev
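
To make the idea of per-step component monitoring concrete, here is a minimal sketch of what a balance check could look like. It is not RewardGuard's actual API: the class name `ComponentBalanceMonitor`, the `report()` method, and both thresholds are hypothetical and chosen only for illustration. It accumulates each reward component's absolute contribution and flags components that dominate or barely register.

```python
from collections import defaultdict


class ComponentBalanceMonitor:
    """Hypothetical monitor for illustration; not RewardGuard's actual API."""

    def __init__(self, dominance=0.8, starvation=0.02):
        # Assumed thresholds: share of total absolute reward above which a
        # component counts as dominant, and below which it counts as starved.
        self.dominance = dominance
        self.starvation = starvation
        self.totals = defaultdict(float)
        self.steps = 0

    def step(self, components):
        """Record one training step's per-component rewards (name -> value)."""
        for name, value in components.items():
            self.totals[name] += abs(value)
        self.steps += 1

    def report(self):
        """Return each component's share of total absolute reward, plus flags."""
        grand_total = sum(self.totals.values())
        if grand_total == 0:
            return {"shares": {}, "dominant": [], "starved": []}
        shares = {k: v / grand_total for k, v in self.totals.items()}
        return {
            "shares": shares,
            "dominant": sorted(k for k, s in shares.items() if s > self.dominance),
            "starved": sorted(k for k, s in shares.items() if s < self.starvation),
        }


# Usage: one step() call per environment step, then inspect the report.
monitor = ComponentBalanceMonitor()
for _ in range(100):
    monitor.step({"progress": 1.0, "safety": 0.0, "style": 0.1})
result = monitor.report()
print(result["dominant"], result["starved"])
```

A component that takes nearly all of the reward mass is a common symptom of reward hacking, and one that contributes almost nothing is effectively starved. A real tool would likely use windowed statistics rather than a running total, so that late-training drift is not hidden by early history.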

5 stars · Python

RewardGuard – detect reward hacking in RL training loops

by Giovan321 · Apr 26, 2026 · 1 point · 1 comment

AI Analysis

●● Solid · Niche Gem · Big Brain

Catches reward hacking before it tanks your RL training run.

Strengths
  • Targets reward hacking specifically, a genuinely hard RL debugging problem
  • Integrates with existing RL stack: WandB, TensorBoard, Stable Baselines3
  • Clear output format with actionable weight adjustment recommendations
Weaknesses
  • Premium auto-adjustment features are private, so their full value can't be evaluated
  • Very early stage: 2 stars, zero issues or pull requests on GitHub
Category
Target Audience

Reinforcement learning engineers and ML researchers

Similar To

Weights & Biases · TensorBoard · MLflow

Similar Projects

Education · Mid

rlvrbook

Educational content in a space where Nathan Lambert's RLHF book already exists.

Niche Gem
kyars
111mo ago