We fingerprinted 178 AI models' writing styles and similarity clusters

by nuancedev · Apr 8, 2026 · 78 points · 22 comments

AI Analysis

●●● Banger · Big Brain · Dark Horse · Solve My Problem

Found that Gemini 2.5 Flash Lite writes 78% like Claude 3 Opus but costs 185x less.

Strengths

•3,095 analyzed responses across 43 prompts give statistically meaningful data.
•Price arbitrage findings are immediately actionable for cost-conscious teams.
•Cross-provider twin detection reveals training data convergence patterns.

Weaknesses

•Stylometric fingerprinting may not capture reasoning or code quality differences.
•Findings could become stale as models update frequently.

Post Description

We have a dataset of 3,095 standardized AI responses across 43 prompts. From each response, we extract a 32-dimension stylometric fingerprint (lexical richness, sentence structure, punctuation habits, formatting patterns, discourse markers).
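To make the idea of a stylometric fingerprint concrete, here is a minimal sketch of extracting one feature per category listed above. The function name `extractFingerprint`, the `MARKERS` list, and the five features are assumptions chosen for illustration; they are not the 32 dimensions the actual analysis uses.

```javascript
// Hypothetical discourse-marker list; the real list is not given in the post.
const MARKERS = new Set(["however", "moreover", "therefore", "furthermore", "additionally"]);

// Illustrative extractor: one stand-in feature per category named in the post.
function extractFingerprint(text) {
  const words = text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const lines = text.split("\n");
  const wordCount = Math.max(words.length, 1); // avoid dividing by zero on empty input

  return {
    // Lexical richness: unique words divided by total words.
    typeTokenRatio: new Set(words).size / wordCount,
    // Sentence structure: average number of words per sentence.
    meanSentenceLength: words.length / Math.max(sentences.length, 1),
    // Punctuation habits: commas per word.
    commaRate: (text.match(/,/g) ?? []).length / wordCount,
    // Formatting patterns: share of lines that start with a bullet marker.
    bulletLineShare: lines.filter((l) => /^\s*[-*•]\s/.test(l)).length / lines.length,
    // Discourse markers: hits from the marker list per word.
    discourseMarkerRate: words.filter((w) => MARKERS.has(w)).length / wordCount,
  };
}
```

Calling `Object.values(extractFingerprint(response))` would give a numeric vector per response; averaging those vectors per model is one plausible way to get a per-model fingerprint, though the post does not say how responses are aggregated.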

Some findings:

- 9 clone clusters (>90% cosine similarity on z-normalized feature vectors)
- Mistral Large 2 and Large 3 2512 score 84.8% on a composite metric combining 5 independent signals
- Gemini 2.5 Flash Lite writes 78% like Claude 3 Opus. Costs 185x less
- Meta has the strongest provider "house style" (37.5x distinctiveness ratio)
- "Satirical fake news" is the prompt that causes the most writing convergence across all models
- "Count letters" causes the most divergence
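The clone-cluster criterion (cosine similarity above 90% on z-normalized feature vectors) can be sketched roughly as follows. The function names and the input shape (an object mapping model name to a numeric feature array) are assumptions, and this version only lists qualifying pairs; how pairs are merged into clusters is not described in the post.

```javascript
// Z-normalize each feature column across all models (population standard deviation).
function zNormalize(vectors) {
  const dims = vectors[0].length;
  const n = vectors.length;
  const means = Array.from({ length: dims }, (_, d) =>
    vectors.reduce((sum, v) => sum + v[d], 0) / n);
  const stds = means.map((mean, d) =>
    Math.sqrt(vectors.reduce((sum, v) => sum + (v[d] - mean) ** 2, 0) / n));
  // A constant feature carries no signal, so map it to 0 instead of dividing by 0.
  return vectors.map((v) =>
    v.map((x, d) => (stds[d] === 0 ? 0 : (x - means[d]) / stds[d])));
}

// Cosine of the angle between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return every pair of models whose similarity exceeds the threshold.
function findClonePairs(fingerprints, threshold = 0.9) {
  const names = Object.keys(fingerprints);
  const z = zNormalize(names.map((name) => fingerprints[name]));
  const pairs = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      const similarity = cosineSimilarity(z[i], z[j]);
      if (similarity > threshold) pairs.push({ a: names[i], b: names[j], similarity });
    }
  }
  return pairs;
}
```

Normalizing per feature before taking the cosine keeps large-valued features (such as response length) from dominating small-valued ones (such as comma rate), which is presumably why the post normalizes first.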

The composite clone score combines: prompt-controlled head-to-head similarity, per-feature Pearson correlation across challenges, response length correlation, cross-prompt consistency, and aggregate cosine similarity.
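The post does not say how the five signals are weighted or scaled, so the following is only a sketch under stated assumptions: each signal is already expressed on a 0 to 1 scale, and the default is an equal-weight mean. The signal keys and the `compositeCloneScore` name are hypothetical.

```javascript
// Hypothetical keys for the five signals named in the post.
const SIGNALS = [
  "headToHead",             // prompt-controlled head-to-head similarity
  "featureCorrelation",     // per-feature Pearson correlation across challenges
  "lengthCorrelation",      // response length correlation
  "crossPromptConsistency", // cross-prompt consistency
  "aggregateCosine",        // aggregate cosine similarity
];

// Weighted mean of the five signals. Equal weights are an assumption;
// pass `weights` to override any of them.
function compositeCloneScore(signals, weights = {}) {
  let weighted = 0;
  let total = 0;
  for (const name of SIGNALS) {
    if (typeof signals[name] !== "number") {
      throw new Error(`missing signal: ${name}`);
    }
    const w = weights[name] ?? 1;
    // Clamp to [0, 1] so a negative correlation cannot push the score below 0.
    const value = Math.min(1, Math.max(0, signals[name]));
    weighted += w * value;
    total += w;
  }
  return total === 0 ? 0 : weighted / total;
}
```

Combining several signals this way means a pair must look alike on more than aggregate cosine similarity alone to score highly, which matches the post's description of the signals as independent.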

Tech: stylometric extraction in Node.js, z-score normalization, cosine similarity for aggregate, Pearson correlation for per-feature tracking. Analysis script is ~1400 lines.
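For the per-feature tracking step, a Pearson correlation in plain Node.js might look like the sketch below. It assumes each model is represented as one feature vector per prompt, with prompts in the same order for both models; `perFeatureCorrelation` and that input layout are illustrative assumptions, not the analysis script's actual structure.

```javascript
// Pearson correlation coefficient of two equal-length numeric series.
// Returns 0 when either series is constant (the coefficient is undefined there).
function pearson(xs, ys) {
  const n = xs.length;
  if (n !== ys.length || n < 2) {
    throw new Error("need two equal-length series with at least 2 points");
  }
  const meanX = xs.reduce((s, x) => s + x, 0) / n;
  const meanY = ys.reduce((s, y) => s + y, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  const denom = Math.sqrt(varX * varY);
  return denom === 0 ? 0 : cov / denom;
}

// rowsA / rowsB: one feature vector per prompt, same prompt order for both models.
// Returns one correlation per feature, showing whether the two models move
// together on that feature as the prompt changes.
function perFeatureCorrelation(rowsA, rowsB) {
  const dims = rowsA[0].length;
  return Array.from({ length: dims }, (_, d) =>
    pearson(rowsA.map((r) => r[d]), rowsB.map((r) => r[d])));
}
```

Unlike aggregate cosine similarity, this looks at how each feature varies from prompt to prompt, so two models with similar averages but different prompt-by-prompt behavior would score lower here.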