GitHub Repository

Ollama Model Test - Figure out the best model for the task

0 stars · Python

OMT – A simple Python CLI for testing local Ollama models

by bethanyhunt·Jun 4, 2026·2 points·0 comments

AI Analysis

Mid · Cozy · Niche Gem

Prompt-hashed folders make model comparison easy, but Ollama testing tools already exist.

Strengths
  • Zero dependencies — pure Python standard library, no pip install needed
  • Prompt-hashed folder structure auto-groups runs for side-by-side comparison
  • Records timing and token counts alongside each response for benchmarking
Weaknesses
  • Basic wrapper around Ollama API without novel techniques or architecture
  • No statistical analysis or visualization — just raw output files to compare manually
Target Audience

Developers testing local LLM models

Similar To

lm-evaluation-harness · fmbench · llm-benchmark

Post Description

Selecting the "best" local model usually depends on the task and the hardware.

I created this script as an easy way to test local Ollama models and keep the test output organized.

When run interactively, the script asks which model to use, what the prompt is, how many times to run it, and (optionally) what temperature to set. It can also be scripted with command-line flags.
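As a rough illustration of the two modes described above, here is a minimal sketch that reads options from flags and falls back to interactive questions when they are missing. The flag names (`--model`, `--prompt`, `--runs`, `--temperature`) and the helper `resolve_options` are assumptions made for this example, not taken from the OMT source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flag names; the real tool may use different ones.
    parser = argparse.ArgumentParser(description="Test a local Ollama model")
    parser.add_argument("--model", help="name of the model to test")
    parser.add_argument("--prompt", help="prompt text to send")
    parser.add_argument("--runs", type=int, help="number of repetitions")
    parser.add_argument("--temperature", type=float, default=None,
                        help="optional sampling temperature")
    return parser


def resolve_options(argv=None, ask=input) -> dict:
    """Use flags when given; otherwise ask the user for the missing values."""
    args = build_parser().parse_args(argv)
    # Treat the run as interactive if the model or the prompt was not passed.
    interactive = args.model is None or args.prompt is None

    model = args.model or ask("Model: ").strip()
    prompt = args.prompt or ask("Prompt: ").strip()

    runs = args.runs
    if runs is None:
        runs = int(ask("Runs [1]: ").strip() or "1") if interactive else 1

    temperature = args.temperature
    if temperature is None and interactive:
        raw = ask("Temperature (blank to skip): ").strip()
        temperature = float(raw) if raw else None

    return {"model": model, "prompt": prompt, "runs": runs,
            "temperature": temperature}
```

Calling `resolve_options()` with no arguments reads `sys.argv`, so the same function would serve both a fully scripted invocation and a bare interactive one.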

The output is saved in Markdown/JSON within an organized file structure for easy comparison. Outputs using the same prompt go into a folder together, each output named for the model tested. Timing data and token counts are also recorded.
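The layout described above could be sketched roughly as follows. This is an assumption-laden illustration, not the author's code: the truncated SHA-256 folder name, the choice to write both a Markdown and a JSON file per model, and the helper names `prompt_folder` and `save_run` are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def prompt_folder(root: Path, prompt: str) -> Path:
    # Assumption: folder name is the first 12 hex chars of the prompt's SHA-256.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
    return root / digest


def save_run(root: Path, prompt: str, model: str, response: str, stats: dict) -> Path:
    """Write one run into the folder shared by every run of the same prompt."""
    folder = prompt_folder(root, prompt)
    folder.mkdir(parents=True, exist_ok=True)

    # Model tags can contain ":" or "/", which are awkward in file names.
    safe_model = model.replace(":", "_").replace("/", "_")

    # Human-readable copy of the response.
    md_path = folder / f"{safe_model}.md"
    md_path.write_text(f"# {model}\n\n{response}\n", encoding="utf-8")

    # Machine-readable record, including any timing and token statistics.
    record = {"model": model, "prompt": prompt, "response": response, **stats}
    json_path = folder / f"{safe_model}.json"
    json_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return json_path
```

Because the folder name depends only on the prompt text, two models run against the same prompt land next to each other, while any change to the prompt produces a new folder.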

The tool is intentionally small and dependency-free (standard library only).
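To show what a standard-library-only approach can look like, here is a minimal sketch that calls a local Ollama server with `urllib` and derives timing and token figures from the reply. It assumes Ollama is listening on its default address and uses the non-streaming `/api/generate` endpoint; the helper names `generate` and `summarize` are invented for this example, and the actual script may be structured differently.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def generate(model, prompt, temperature=None, url=OLLAMA_URL):
    """Send one non-streaming generation request and return the parsed JSON."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if temperature is not None:
        payload["options"] = {"temperature": temperature}
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


def summarize(result):
    """Extract token counts and timings; Ollama reports durations in nanoseconds."""
    eval_count = result.get("eval_count", 0)
    eval_seconds = result.get("eval_duration", 0) / 1e9
    return {
        "prompt_tokens": result.get("prompt_eval_count", 0),
        "output_tokens": eval_count,
        "total_seconds": result.get("total_duration", 0) / 1e9,
        # Avoid dividing by zero when the reply carries no timing data.
        "tokens_per_second": eval_count / eval_seconds if eval_seconds else None,
    }
```

With a server running, `summarize(generate("some-model", "Hello"))` would return a small dictionary suitable for saving alongside the response text.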

Suggestions welcome.

Similar Projects

AI/ML · ●● Solid

Prompt Builder – A block-based editor for composing AI prompts

The block metaphor and live compiled preview are honest, practical improvements for anyone wrestling with long, conditional prompts — toggles for A/B testing and global {{vars}} are especially handy. Multi-model execution and editable response panes show the author thought about iteration and comparison, but the screenshot feels safe and functional rather than boldly new; I want to know how it handles collaboration, exports, and model/credit management.

Solve My Problem · Niche Gem
Jaber_Said
103mo ago
Developer Tools · ●●● Banger

Timber – Ollama for classical ML models, 336x faster than Python

336× faster tree model inference; compiles sklearn/XGBoost to C99, serves like Ollama.

Wizardry · Solve My Problem
kossisoroyce
207333mo ago