GitHub Repository

Post-training Qwen2.5-0.5B-Instruct to talk like GenZ

5 stars · Jupyter Notebook

LLM post-training to speak like GenZ, costing less than a cup of coffee

by aidarbek·May 11, 2026·5 points·1 comment

AI Analysis

Mid · Ship It

Spent two dollars to teach a model to say 'vibes' instead of 'feelings'.

Strengths
  • Demonstrates a full SFT plus GRPO reinforcement learning pipeline for under $2.
  • Synthetic data generation strategy reduces manual labeling overhead significantly.
Weaknesses
  • Frontier models already mimic slang better without needing a dedicated fine-tune.
  • Reward function based on keyword counting is too simplistic for true style transfer.
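To illustrate the keyword-counting weakness, here is a minimal sketch of what such a reward could look like. It is not taken from the repository: the `SLANG_TERMS` list, the `keyword_reward` name, and the `cap` parameter are all assumptions made for illustration.

```python
import re

# Hypothetical slang vocabulary; the project's actual keyword list is not shown on this page.
SLANG_TERMS = ("vibes", "no cap", "slay", "lowkey", "bet")


def keyword_reward(completion: str, cap: int = 3) -> float:
    """Score a completion in [0, 1] by counting whole-word slang hits, capped at `cap`."""
    text = completion.lower()
    hits = sum(
        len(re.findall(r"\b" + re.escape(term) + r"\b", text))
        for term in SLANG_TERMS
    )
    # Capping keeps the score bounded, but repeated keywords still saturate it.
    return min(hits, cap) / cap


if __name__ == "__main__":
    print(keyword_reward("I have feelings about this"))
    print(keyword_reward("good vibes, no cap"))
    print(keyword_reward("vibes vibes vibes vibes"))
```

Under these assumptions, a completion that only repeats one keyword reaches the maximum score, which is the kind of reward hacking a count-based signal invites during RL training.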
Category
Target Audience

ML learners experimenting with post-training techniques

Similar To

Hugging Face · Axolotl

Similar Projects

AI/ML · ●●● Banger

MaximusLLM – Train 262k-vocab LLMs on a single 16GB GPU

Ghost Logit math bypasses 262k vocab OOM without materializing full matrices.

Big Brain · Wizardry · Zero to One
yousef_g
202mo ago