The complete Open Library catalog in clean, analysis-ready Parquet

by tamnd·Mar 24, 2026·1 point·0 comments

AI Analysis

●●● Banger · Solve My Problem · Big Brain

Clean Parquet dump of 55M Open Library rows saves weeks of data cleaning.

Data · ●● Solid

518k Vietnamese legal documents fill a massive gap in Southeast Asian NLP datasets.

Niche Gem · Dark Horse

th1nhng0

30 · 2mo ago

Data · ●● Solid

47M HN items in Parquet, auto-updating every 5 minutes on Hugging Face.

Niche Gem · Solve My Problem

tamnd

408167 · 3mo ago

Data · ● Mid

Pre-cleaned ArXiv metadata in Parquet saves hours of ETL pipeline work.

Cozy · Niche Gem

tamnd

40 · 2mo ago

AI/ML · ●● Solid

Cross-platform dataset search with health scores, for when Kaggle and Hugging Face listings are fragmented.

Solve My Problem · Slick

nasibahd

10 · 2mo ago

AI/ML · ●● Solid

MCP-native tool lets AI agents fetch and clean datasets without human intervention.

Niche Gem · Solve My Problem · Ship It

sultanchek

20 · 2mo ago

AI/ML · ●● Solid

MCP server lets agents build ML datasets autonomously, from search to export, without manual work.

Big Brain · Ship It

sultanchek

10 · 3mo ago