A 150M model that extracts verbatim evidence spans for RAG, no LLM call

by justacoolname·Jun 10, 2026·4 points·0 comments

AI Analysis

●● Solid · Niche Gem · Big Brain · Solve My Problem

A 150M-parameter model replaces LLM calls for evidence extraction, with comparable F1 scores.

Strengths
  • At 150M parameters, the model runs locally, giving cheap, deterministic inference for production RAG instead of per-query LLM calls.
  • Trained on financial tables, legal contracts, and medical documents, not just Wikipedia QA like competitors.
  • The 8,192-token ModernBERT context handles long passages without chunking overhead.
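
The listing does not say how the model turns encoder output into verbatim spans. As a rough illustration only, the sketch below assumes a token-classification design: each token gets a relevance score, and runs of adjacent tokens above a threshold are merged and sliced straight out of the source text, so the output is always a verbatim substring. The function name `merge_spans`, the whitespace tokenization, the scores, and the 0.5 threshold are all hypothetical, not taken from the project.

```python
import re
from typing import List, Tuple


def merge_spans(
    text: str,
    offsets: List[Tuple[int, int]],
    scores: List[float],
    threshold: float = 0.5,  # hypothetical cutoff, not from the project
) -> List[str]:
    """Merge runs of adjacent high-scoring tokens into verbatim substrings."""
    spans: List[str] = []
    start = end = None
    for (tok_start, tok_end), score in zip(offsets, scores):
        if score >= threshold:
            if start is None:
                start = tok_start  # open a new span at this token
            end = tok_end          # extend the current span
        elif start is not None:
            spans.append(text[start:end])  # close the span, copy text as-is
            start = None
    if start is not None:
        spans.append(text[start:end])
    return spans


text = "Revenue rose 12% in Q3. The weather was mild."
# Whitespace tokens stand in for the model's tokenizer (an assumption).
offsets = [m.span() for m in re.finditer(r"\S+", text)]
# Made-up per-token relevance scores for the query "How did revenue change?"
scores = [0.9, 0.8, 0.95, 0.7, 0.9, 0.1, 0.2, 0.1, 0.1]

spans = merge_spans(text, offsets, scores)
print(spans)
```

Because the spans are sliced from the input by character offset rather than generated, the same input and scores always yield the same output, which is one plausible reading of the "deterministic" claim above.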
Weaknesses
  • Other extractors exist (Provence, Zilliz Semantic-Highlight), so the category isn't empty.
  • Benchmark claims focus on ACL gold; real-world RAG performance is less proven.
Target Audience

ML engineers building RAG systems, teams needing cheaper evidence extraction than LLM calls

Similar To

Zilliz Semantic-Highlight · Provence · MultiSpanQA
