Scholar Sidekick – citation verifier for the "real DOI, wrong paper"

by ProductivePhys · Jun 2, 2026 · 2 points · 3 comments

AI Analysis

●● Solid · Solve My Problem · Niche Gem

Catches real-DOI-wrong-paper hallucinations that Zotero and Citation Machine miss.

Strengths

• Verification feature directly addresses the Topaz et al. research on fabricated citations
• MCP server support lets AI agents call the API, with a free rate-limited tier for light use
• 10,000+ CSL styles and 9 export formats cover virtually every academic workflow

Weaknesses

• Citation formatting is a solved category with many free alternatives
• Verification only catches title mismatches, not whether the paper actually supports the claim

Post Description

One of the harder AI citation failures to catch is also one of the simplest: the identifier is real, but the citation is still fake. The DOI resolves, but to a different paper than the one the citation claims.

Topaz et al. reported their findings on citation hallucination in May in The Lancet. They scanned 2.5 million PubMed Central articles and estimated that 1 in 277 contained a fabricated citation. Some of their examples were this exact pattern: real identifier, fabricated title.

I originally built Scholar Sidekick as a formatter for my own use as a clinician-educator preparing talks, articles, etc. After reading the Topaz paper, I added a verifier to catch the most common pattern they found: a real identifier attached to the wrong paper.

My tool resolves the identifier, and then compares the title in your reference with the returned metadata (i.e. does this DOI, PMID, or arXiv ID actually point to the right paper?). It does not attempt to judge whether the cited paper actually supports the claim you make in your text. That still needs judgment, preferably human judgment.
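
The resolve-then-compare step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Scholar Sidekick's actual code: the function names, the 0.85 threshold, and the use of Python's `difflib` similarity ratio are all hypothetical, and the resolver is a stand-in lookup table rather than a real DOI, PMID, or arXiv metadata service.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from typing import Callable, Optional


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(no_punct.split())


def title_similarity(claimed: str, resolved: str) -> float:
    """Similarity in [0, 1] between two titles after normalization."""
    return SequenceMatcher(
        None, normalize_title(claimed), normalize_title(resolved)
    ).ratio()


def verify_citation(
    identifier: str,
    claimed_title: str,
    resolve: Callable[[str], Optional[str]],
    threshold: float = 0.85,  # hypothetical cut-off, not the tool's real value
) -> str:
    """Return 'match', 'mismatch', or 'unresolved' for one citation."""
    resolved_title = resolve(identifier)
    if resolved_title is None:
        # Identifier did not resolve (or the lookup failed): not scorable.
        return "unresolved"
    if title_similarity(claimed_title, resolved_title) >= threshold:
        return "match"
    return "mismatch"


# Stand-in for a metadata lookup; a real resolver would query a registry.
# The identifier and title below are made up for illustration.
STUB_REGISTRY = {
    "10.1000/example.1": "Deep learning for sepsis prediction in intensive care",
}


def stub_resolve(identifier: str) -> Optional[str]:
    return STUB_REGISTRY.get(identifier)


if __name__ == "__main__":
    print(verify_citation(
        "10.1000/example.1",
        "Effects of vitamin D supplementation on bone density",
        stub_resolve,
    ))
```

Passing the resolver in as a function keeps the comparison logic testable offline, and the separate "unresolved" outcome keeps lookup failures from being counted as either real or fabricated.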

In a test, I ran 350 previously unseen citations through the API, once each. It correctly flagged all 37 fabricated references, but wrongly flagged 5 of 285 real references: 1.8% (95% CI 0.8–4.0%). That run used plain similarity comparison, without the optional LLM screening; I would expect the LLM to rescue some of those borderline cases. A handful of citations returned no result because of upstream timeouts and weren't scorable either way. The test suite, results and failures are public, so you can check them yourself rather than take my word for it.
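
For anyone who wants to re-derive the false-positive figure, here is a short sketch. The post does not say which interval method was used; the Wilson score interval is an assumption here, chosen because it reproduces the reported 0.8–4.0% range from 5 of 285.

```python
import math
from typing import Tuple


def wilson_interval(events: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Counts taken from the test described above.
false_positives = 5
real_references = 285

rate = false_positives / real_references
low, high = wilson_interval(false_positives, real_references)

print(f"{rate:.1%} (95% CI {low:.1%}-{high:.1%})")
# prints: 1.8% (95% CI 0.8%-4.0%)
```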

The web version is free and anonymous. The REST API and MCP server use a RapidAPI key, with a free rate-limited tier and paid tiers above that. The MCP server is on npm, Smithery and Glama, and the Obsidian plugin is in the community store. Chrome, Firefox and Edge browser extensions are in their respective stores as well.

I'm very open to feedback and look forward to hearing from anyone who tries it: what works, and what fails? Thanks in advance.