GitHub Repository

Graph-Oriented Generation (GOG)

64 stars · Python

Experiments Mapping the "Primitive Layer" in Language Models

Author: dchisholm125

by dchisholm125 · Mar 15, 2026 · 2 points · 0 comments


AI Analysis

●●●● Gem · Wizardry · Big Brain · Zero to One

Semantic primitives show up in activation patterns across Qwen, Gemma, LLaMA, SmolLM2.

Strengths

• 18 experiments across 4 architectures with cross-validated results.
• Layer 0a/0b distinction is architecture-independent with +0.245 activation gap.
• Primitive composition produces predictable Layer 1 concepts in 3/4 models.

Weaknesses

• Research-focused, not a product developers can use immediately.
• Small models only (360M to 1B); behavior in larger models is untested.

Post Description

I spent months running experiments on what language models do when they encounter inputs outside their training distribution — random phonemes, invented morphemes, Wierzbicka's semantic primitives.

The finding that surprised me: Language model behavior follows a reproducible taxonomy (Synthesis, Collapse, Overflow, Metacognition, Linguistic Drift). It's not random noise — it's classifiable.

The finding that matters for interpretability: Structure is a more reliable control variable than content. Telling a model how to structure reasoning produces consistent outputs. Telling it what to reason about doesn't.

The finding that might matter most: Wierzbicka's semantic primitives (WANT, KNOW, FEEL, TIME, etc.) appear as measurable activation patterns in small language models across four different architectures — Qwen, Gemma, LLaMA, and SmolLM2.
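The post does not say how the activation patterns were compared, so the following is only an illustrative sketch of one simple way a gap between two word groups could be computed: the difference between the mean L2 norms of their hidden-state vectors. The function names, the choice of L2 norm, and the toy vectors are all assumptions for illustration; none of them come from the repository, and the toy numbers are not experimental data.

```python
import math
from statistics import fmean


def l2_norm(vector):
    """Euclidean length of one activation vector (a plain list of floats)."""
    return math.sqrt(sum(x * x for x in vector))


def mean_activation_gap(primitive_acts, control_acts):
    """Mean L2 norm of the primitive group minus that of the control group.

    Hypothetical metric: each argument is a list of activation vectors,
    one vector per token. The source does not define its gap this way.
    """
    primitive_mean = fmean([l2_norm(v) for v in primitive_acts])
    control_mean = fmean([l2_norm(v) for v in control_acts])
    return primitive_mean - control_mean


# Toy vectors chosen so the arithmetic is easy to check by hand.
toy_primitives = [[3.0, 4.0], [6.0, 8.0]]  # norms 5 and 10, mean 7.5
toy_controls = [[0.0, 1.0], [0.0, 3.0]]    # norms 1 and 3, mean 2.0

gap = mean_activation_gap(toy_primitives, toy_controls)
print(gap)  # 5.5
```

In a real run the vectors would come from a model's hidden states at a chosen layer, and a per-layer comparison across architectures would need the same tokenization and layer-selection choices in each model.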

18 experiments. 4 architectures. Cross-validated. Real data.

Full paper, experiment code, and primitive vocabulary JSON: https://github.com/dchisholm125/graph-oriented-generation

The primitive layer is waiting to be mapped.

Similar Projects

Education · ○ Pass

Learn language with word-by-word translation maps [video]

YouTube demo with no working product or code to evaluate.

julienreszka

2011d ago

AI/ML · ● Mid

SFT to convert a base language model into a conversational chat model

Tutorial code for SFT pipeline, but dozens of identical examples exist on GitHub.

Ship It

onurkanbkrc

104mo ago

AI/ML · ● Mid

The platform layer for agentic ML engineering

Another MLOps platform competing with MLflow and Weights & Biases.

Ship It

iryna_kondr

401mo ago

Security · ● Mid

Fortress Language: Cybersecurity DSL

New security DSL with built-in recon primitives, but Python already does this.

Bold Bet · Slick

CzaxTanmay

204mo ago

AI/ML · ○ Pass

Binary is no longer safe

Uses differential-property testing as an automated feedback loop to validate LLM-driven rewrites, which is the clever bit that turns flaky translations into repeatable refinement. The author targets a closed-source MUD DLL to avoid model memorization, and walks through why raw assembly prompts failed and how decompiled C plus tests, followed by LLM translation to Rust, succeeds. It's a thoughtful, slightly alarming demo with concrete techniques you can try yourself, not just vaporware.

Wizardry · Big Brain · Niche Gem

seddonm1

305mo ago

Education · ○ Pass

Practicing foreign language generating conversation on topic [video]

Video demo of a personal workflow, not a tool others can actually use.

julienreszka

301mo ago