I solved my study problems by talking to a goose
Feynman technique via voice chat with a goose avatar that actually works.

Understanding meter that drops when you hand-wave instead of explaining mechanisms.
Students, self-learners, professionals preparing for interviews or exams
Khanmigo · Quizlet Q-Chat · Anki with AI plugins
that raises the question: how does the goose know when it has understood?
that’s when I thought of an understanding bar: always visible to the user, showing how much the goose understands you, from 0 to 100%.
the original logic powering the understanding bar went something like this: every turn, I’d send the conversation to an llm and ask it to return a number from 0 to 100, with a rubric of brackets to make the output less volatile. 0-10 meant no real understanding, 11-20 meant named but empty, 21-35 meant a partial understanding, and so on, up to 93-100 for the goose understanding your topic exceptionally well. this approach worked. mostly. until I started looking at what came back once real users tested the goose.
two testers were explaining the basics of how a cpu works. the first used textbook-style definitions (fetch, decode, execute, etc.) and reached a final understanding of 87% after a couple of turns. the second used a real-world analogy of a chef, linking it to the concepts of a cpu. same level of understanding, expressed differently. the second tester got a score of 36. I’d built the opposite of what I wanted: a tutor that rewards parroting.
digging into the data to find the source of the variance, I noticed that if I put the same paragraph in verbatim five times, I got five scores that didn’t agree: 51, 66, 51, 70, 51. the brackets kind of stabilized the results, but the score was unexplainable. why 66 and not 70? nothing in the system could tell me; the llm just picked.
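to put a number on that wobble, here is a minimal sketch that summarizes the five scores quoted above. the `spread` helper is a made-up name for illustration, not something from the actual codebase.

```javascript
// the five scores reported above for the same paragraph, pasted in verbatim
const scores = [51, 66, 51, 70, 51];

// summarize how far apart repeated runs of the same input land
function spread(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return { min, max, range: max - min, mean };
}

const result = spread(scores);
console.log(result); // { min: 51, max: 70, range: 19, mean: 57.8 }
```

a 19-point range on identical input is wider than the 21-35 bracket in the rubric, so a single explanation can land in different bands from one run to the next.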
the fix was to stop asking the model to do the math, and build a new system. now every session with a meaningful topic gets a ‘flight plan’: a separate llm call generates 3-4 essential subconcepts (waypoints) that a real explanation must cover, e.g. for photosynthesis: what it uses, what it produces, why plants need it. each turn, the goose’s evaluator returns discrete depth updates per waypoint (0-3: not addressed, named, stated, explained in own words), plus any misconceptions it spotted. javascript handles the rest: the ratchet that makes sure depth only moves up, weighted coverage, the gate to finish (wrap) a session, and the flow to repair a misconception.
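a minimal sketch of what that code-side math could look like. this is an illustration under assumptions, not the real implementation: the waypoint ids, the weights, the 80% wrap threshold, and every function name here are hypothetical. only the 0-3 depth scale and the photosynthesis waypoints come from the description above.

```javascript
// depth scale from the post: 0 not addressed, 1 named, 2 stated, 3 explained in own words
const MAX_DEPTH = 3;

// hypothetical flight plan for photosynthesis; ids and weights are assumptions
function createPlan() {
  return [
    { id: "inputs", label: "what it uses", weight: 1, depth: 0 },
    { id: "outputs", label: "what it produces", weight: 1, depth: 0 },
    { id: "purpose", label: "why plants need it", weight: 2, depth: 0 },
  ];
}

// ratchet: an evaluator update can raise a waypoint's depth but never lower it
function applyUpdates(plan, updates) {
  return plan.map((wp) => {
    const proposed = updates[wp.id];
    if (!Number.isInteger(proposed)) return wp; // waypoint not mentioned this turn
    const clamped = Math.min(MAX_DEPTH, Math.max(0, proposed));
    return { ...wp, depth: Math.max(wp.depth, clamped) };
  });
}

// weighted coverage as a 0-100 number, computed by code instead of by the model
function coverage(plan) {
  const total = plan.reduce((sum, wp) => sum + wp.weight, 0);
  const earned = plan.reduce((sum, wp) => sum + wp.weight * (wp.depth / MAX_DEPTH), 0);
  return Math.round((earned / total) * 100);
}

// wrap gate: enough coverage and no misconception still waiting for repair
// (the default threshold of 80 is a placeholder)
function canWrap(plan, openMisconceptions, threshold = 80) {
  return openMisconceptions.length === 0 && coverage(plan) >= threshold;
}

let plan = createPlan();
plan = applyUpdates(plan, { inputs: 2, purpose: 3 });
plan = applyUpdates(plan, { inputs: 1 }); // a lower depth on a later turn is ignored
console.log(coverage(plan)); // 67
console.log(canWrap(plan, [])); // false
```

the point of the design is that the same evaluator output always produces the same bar, and every point on the bar traces back to a specific waypoint and depth.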
what if the user introduces a subtopic the plan didn’t anticipate?
in that case, the system decides whether to amend the plan mid-session, with a backfill evaluation to credit prior turns. I also added 5 levels of intelligence to the goose (breezy to razor sharp). at every level the model judges objective depth, then code decides what’s enough. the same chef analogy now scores 87, because the evaluation prompt explicitly tells the llm that the waypoint’s ideal answer is just one valid framing, not the only one.
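a rough sketch of both ideas, under loud assumptions. the post only names the "breezy" and "razor sharp" levels, so the middle levels are left unnamed and every threshold is a placeholder. `amendPlan`, `isEnough`, and `toyEvaluator` are hypothetical names, and the toy evaluator is a keyword check standing in for the real llm call.

```javascript
// hypothetical level table: only "breezy" and "razor sharp" are named in the post,
// and every threshold number here is a placeholder
const LEVELS = [
  { level: 1, name: "breezy", wrapThreshold: 50 },
  { level: 2, wrapThreshold: 60 },
  { level: 3, wrapThreshold: 70 },
  { level: 4, wrapThreshold: 80 },
  { level: 5, name: "razor sharp", wrapThreshold: 90 },
];

// the model reports depth; code decides whether that is enough for the chosen level
function isEnough(coveragePercent, level) {
  const entry = LEVELS.find((l) => l.level === level);
  if (!entry) throw new RangeError(`unknown level: ${level}`);
  return coveragePercent >= entry.wrapThreshold;
}

// amend the plan mid-session, then backfill: score earlier student turns against
// the new waypoint and keep the best depth seen, so prior explanations get credit
function amendPlan(plan, newWaypoint, studentTurns, evaluateTurn) {
  const depth = studentTurns.reduce(
    (best, turn) => Math.max(best, evaluateTurn(turn, newWaypoint)),
    0
  );
  return [...plan, { ...newWaypoint, depth }];
}

// toy stand-in for the llm evaluator: depth 2 if the turn mentions the keyword
const toyEvaluator = (turn, waypoint) => (turn.includes(waypoint.keyword) ? 2 : 0);

const turns = [
  "the chef reads the next recipe card",
  "the chef keeps salt within reach, like a cache",
];
const amended = amendPlan(
  [{ id: "cycle", weight: 1, depth: 3 }],
  { id: "cache", keyword: "cache", weight: 1 },
  turns,
  toyEvaluator
);
console.log(amended[1].depth); // 2
console.log(isEnough(70, 1), isEnough(70, 5)); // true false
```

keeping the thresholds in a table means the evaluation prompt can stay the same at every level; only the bar for "enough" moves.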
to validate these changes, I sat down and acted as 15 different types of users, typing differently, explaining differently, etc., then made changes based on the responses and iterated. one little bug I found was the llm evaluator giving credit to the wrong actor: the goose would teach via analogy and the student would get credit for it. fixed that too.
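the post doesn’t say how the attribution bug was fixed. one possible guard, sketched here with assumed role names ("goose", "student") and a hypothetical `studentEvidence` helper, is to pass only the student’s own words to the evaluator as creditable evidence.

```javascript
// keep only what the student said, so the goose's analogies never count as evidence
function studentEvidence(transcript) {
  return transcript
    .filter((turn) => turn.role === "student")
    .map((turn) => turn.text);
}

const transcript = [
  { role: "student", text: "the cpu fetches an instruction" },
  { role: "goose", text: "so it's like a chef reading a recipe card?" },
  { role: "student", text: "yes" },
];

console.log(studentEvidence(transcript));
// [ 'the cpu fetches an instruction', 'yes' ]
```

here the chef analogy came from the goose, so the student’s "yes" is the only thing left to judge for that turn, and a bare "yes" shouldn’t earn "explained in own words".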
lesson worth keeping: if you build anything where an llm needs to rate or rank by number, don’t trust the number. give the model something discrete to judge, not something subjective, otherwise it will fake and hallucinate.
professor goose is live if you want to try it!
Socratic AI tutor, but ChatGPT already refuses answers—positioning isn't enough.
Educational AI wrapper—Duolingo, Khan Academy AI, and paid tutors already own this space.
Camera watches your actual homework while AI asks questions instead of giving answers.
The core idea—make the assistant refuse to write code until you prove you understand the problem—is a sharp behavioral hack that could really change how people use copilots. The repo/article shows a mode table (temperature, allowed tools like write/edit/bash/read/grep) and an explicit permission model, which is a useful blueprint. Right now it reads like a well-argued workflow and config proposal rather than a plug-and-play tool: I'd want an editor/CLI integration or enforcement layer before I call it a must-use.
Persistent context layer beats Cursor's session amnesia on large codebases.