NPIScan: search 9M U.S. healthcare providers from the NPI registry

by bas_sen · Mar 6, 2026 · 2 points · 0 comments

AI Analysis

Solid · Solve My Problem · Dark Horse

Makes 9M NPI records actually browsable with geo/specialty drill-down and weekly sync.

Strengths
  • Weekly NPPES sync keeps records current for credentialing and compliance workflows
  • Geo-drill (state→city→ZIP) plus specialty filters solve the CMS lookup's one-at-a-time friction
  • Proprietary ranking score (experience, licensing, completeness) adds value over raw NPPES data
Weaknesses
  • Read-only search only; the lack of a verification API or bulk export limits programmatic integration
  • Healthcare provider directories already exist (ZocDoc, Healthgrades); the differentiation is NPI-centric rather than aimed at the broader market
Target Audience

Healthcare recruiters, insurance companies, provider credentialing teams, health researchers

Similar To

ZocDoc · Healthgrades · Doximity

Post Description

I’ve been exploring the NPPES dataset, the federal registry that assigns NPI numbers to every healthcare provider in the U.S. It currently has about 9 million records and grows by ~30k per month, but accessing it usually means downloading multi-gigabyte CSVs or using the CMS lookup that returns one provider at a time.
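
For readers who do start from the bulk download, the file can be streamed row by row instead of loaded whole. The following is a minimal sketch, not NPIScan's code. It assumes the CSV has a header row with a column named `Provider Business Practice Location Address State Name` (verify this against the file you download). `count_by_state` is a hypothetical helper, and the sample rows are made up.

```python
import csv
import io
from collections import Counter
from typing import TextIO

# Assumed column name; verify it against the header of the file you download.
STATE_COL = "Provider Business Practice Location Address State Name"


def count_by_state(rows: TextIO) -> Counter:
    """Stream a CSV file object and count providers per practice state."""
    counts: Counter = Counter()
    # DictReader yields one row at a time, so memory use stays flat
    # even when the file is several gigabytes.
    for row in csv.DictReader(rows):
        state = (row.get(STATE_COL) or "").strip().upper()
        if state:  # skip rows with no practice state
            counts[state] += 1
    return counts


# Tiny in-memory stand-in for the real file (made-up rows, not real providers).
sample = io.StringIO(
    "NPI,Provider Business Practice Location Address State Name\n"
    "0000000001,CA\n"
    "0000000002,ca\n"
    "0000000003,NY\n"
    "0000000004,\n"
)
state_counts = count_by_state(sample)
```

With a real file, the same function would be called on a handle opened with `open(path, newline="", encoding="utf-8")`.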

I built NPIScan to make the dataset browsable. You can search by name, NPI, specialty, or location and drill down from state → city → ZIP code. Each provider has a profile with credentials, practice locations, taxonomy codes, and digital health endpoints.
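
A search box that accepts an NPI can reject typos before running any query, because the tenth digit of an NPI is a Luhn check digit computed over the prefix `80840` plus the first nine digits. The sketch below illustrates that check. `is_valid_npi` is a hypothetical helper, and nothing in the post says NPIScan validates input this way.

```python
NPI_PREFIX = "80840"  # prepended to the NPI before the Luhn check


def is_valid_npi(npi: str) -> bool:
    """Return True if `npi` is 10 ASCII digits and passes the Luhn check."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(c) for c in NPI_PREFIX + npi]
    total = 0
    # Walk right to left; double every second digit, starting with the
    # digit immediately left of the check digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

For example, `is_valid_npi("1234567893")` passes the check, while changing the last digit to `0` fails it. A passing check only means the number is well formed, not that it has been assigned to a provider.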

A few interesting patterns from the data:

- 2025 had ~631k new NPI registrations, the largest jump on record

- Behavior Technicians grew to ~526k providers and are now among the largest specialties

- California alone has ~1.1M providers (~12% of the country)

- Only ~0.5% of providers have registered digital health endpoints

Tech stack: Next.js, PostgreSQL, Meilisearch, Redis. The main challenge was making 9M records feel fast to browse. I solved it with denormalized listing tables, Meilisearch full-text search, and Redis caching for aggregated queries. Most pages respond in <40ms after cache warmup.
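
The caching of aggregated queries described above resembles the common cache-aside pattern: look in the cache first, compute on a miss, then store the result with an expiry. The sketch below is language-agnostic in spirit and is not the project's code (the real stack is Next.js with Redis). `TTLCache` is an in-memory stand-in for Redis, and `cached_aggregate`, the key name, the TTL, and the returned count are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis: stores values with an expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def cached_aggregate(
    cache: TTLCache,
    key: str,
    compute: Callable[[], Any],
    ttl_seconds: float = 3600.0,
) -> Any:
    """Cache-aside: return the cached value, or compute, store, and return it.

    A result of None is treated as a miss and is recomputed on every call.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value


calls = []


def count_providers_in_state() -> int:
    # Hypothetical slow aggregate; a real one would run SQL against PostgreSQL.
    calls.append(1)
    return 42  # made-up number for illustration


cache = TTLCache()
first = cached_aggregate(cache, "count:state:CA", count_providers_in_state)
second = cached_aggregate(cache, "count:state:CA", count_providers_in_state)
```

The second call is served from the cache, so the expensive aggregate runs once per TTL window. Because the data changes only when the registry is re-synced, a long TTL is a reasonable trade-off here.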

Curious to hear feedback from anyone working with healthcare data.
