GitHub Repository

A fully local Retrieval-Augmented Generation (RAG) implementation for querying 25 years of Swiss Teletext news (500k articles in German language)

11 starsJupyter Notebook

Gemma 4 based local RAG on 25 Years of news articles

Name: Gemma 4 based local RAG on 25 Years of news articles
Availability: InStock
Author: folli

by folli·Apr 3, 2026·1 point·0 comments

Visit Project View on HN

AI Analysis

●●SolidNiche GemBig Brain

500k-article Swiss Teletext corpus makes this RAG demo actually interesting.

Strengths

•Hybrid search combining vector + full-text for high-recall retrieval on real corpus
•Fully local execution with pgvector means no data leaves your machine
•Teletext's high-density summaries are genuinely clever source material for RAG

Weaknesses

•German-only corpus limits broader applicability and testing
•Proof of concept rather than general-purpose tool for other datasets

Post Description

A fully local Retrieval-Augmented Generation (RAG) implementation for querying 25 years of Swiss Teletext news (~500k articles in German language) - based on Deepmind's most recent Gemma model.

Why? I thought it's a cool type of dataset (short/high density news summaries) to test some local RAG approaches. Gemma 4 gives some impressive results, but could probably use some more tweaking on the system prompt.