Fine-tuned 3B outperforms Claude Haiku on constrained generation

by serendip-ml · Mar 13, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

Fine-tuned 3B Qwen matches Haiku on jokes, validating small models for constrained agent tasks.

Strengths

• Concrete benchmark data comparing 0.5B–72B Qwen models against Claude Haiku 4.5.
• Details the SFT and DPO pipeline using 3,000 samples, reproducible for other tasks.
• Challenges RAG-heavy agent architectures by showing that trained adapters outperform prompts.

Weaknesses

• No model weights or code repository linked for immediate implementation or testing.
• Joke telling is a narrow benchmark; results may not generalize to complex reasoning.

Category

Target Audience

ML Engineers, AI Agent Developers

Similar To

HuggingFace · Axolotl · Unsloth

Similar Projects

AI/ML · ●● Solid

Flint – A 30B model fine-tuned for less repetition

Fine-tuned Qwen 30B that prioritizes output diversity over convergent accuracy.

Niche Gem · Solve My Problem

thmsmxwll

62 · 2mo ago

AI/ML · ●● Solid

I fine-tuned Qwen 3.5 (0.8B–4B) on a Mac for text-to-SQL – 2B beats 12B

Unified memory trick lets a 2B model beat 12B; trains on MacBook with zero cloud costs.

Ship It · Niche Gem · Big Brain

sciences44

71 · 3mo ago

AI/ML · ● Mid

100% LLM accuracy–no fine-tuning, JSON only

Ancient Rome Q&A benchmark shows 81pp accuracy lift, but lacks adversarial defense evidence.

Big Brain

MysticBirdie

22 · 3mo ago

AI/ML · ● Mid

I fine-tuned Gemma 4 to talk like a pirate

Cool demo, but there's no actual tool to use—just a video and writeup.

Ship It

logicallee

30 · 2mo ago

AI/ML · ●● Solid

Shard-based scheduling for 100x more fine-tuning experiments on 4 GPUs

Shard-based scheduling cuts GPU wait time, though Ray Tune offers similar early stopping.

Big Brain · Solve My Problem

kamranrapidfire

10 · 2mo ago

AI/ML · ● Mid

OpenAI CLIP fine tuned on Galaxy morphology

Galaxy classification model, but model card has mostly empty fields.

Niche Gem

mjupp1

10 · 2mo ago