A navigable map and recommender for 17M music entities

by deppep · May 2, 2026 · 4 points · 2 comments

AI Analysis

●●● Banger · Rabbit Hole · Wizardry · Big Brain

Navigating 17M music entities on a single UMAP projection feels like exploring a galaxy.

Strengths
  • Training a word2vec model on 6M playlists to embed the entire music landscape is a massive, clever data engineering feat.
  • The 'spatial navigation' metaphor for music discovery works surprisingly well compared to standard list-based recommenders.
  • Serving recommendations from a small, cheap box after one-off cloud training (around 50 EUR on an A100) is smart cost optimization.
Weaknesses
  • UMAP projections inherently distort global distances, so tracks that look 'nearby' on the map might be far apart in the original 128-dimensional space.
  • Lacks deep metadata filtering; hard to find specific genres without manually panning the map.
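
The distortion concern above can be checked numerically. Here is a minimal sketch, assuming you have both the original vectors and their 2D map coordinates: it measures how many of each point's nearest neighbors survive the projection. The function names and the toy projection (dropping one axis) are hypothetical stand-ins, not toposonico code and not UMAP itself.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest points to points[i] by Euclidean distance."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    return {j for _, j in sorted(dists)[:k]}

def neighbor_overlap(high, low, k):
    """Mean fraction of each point's k nearest neighbors shared by both spaces."""
    total = 0.0
    for i in range(len(high)):
        total += len(knn(high, i, k) & knn(low, i, k)) / k
    return total / len(high)

# Toy data: 3D points and a hypothetical 2D projection that drops the third axis,
# so points that were far apart along that axis collapse onto each other.
high = [(0, 0, 0), (1, 0, 0), (0, 0, 5), (1, 0, 5)]
low = [(x, y) for x, y, _ in high]
score = neighbor_overlap(high, low, k=1)
```

A score near 1 means the map keeps local neighborhoods intact; a score near 0, as in this deliberately bad toy projection, means map neighbors are not true neighbors.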
Target Audience

Music enthusiasts, data visualization fans, ML practitioners

Similar To

Every Noise at Once · Music-Map · Spotify Audio Features

Post Description

Hello HN,

This is toposonico, a music recommender and navigable map. At its core it's a skip-gram word2vec model trained over ~6M playlists. Tracks are embedded in a 128d space. Embeddings for albums, artists and labels are computed by marginalizing over tracks. The 2D map was built with UMAP.
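
To make the pipeline above concrete, here is a minimal sketch of two of its steps: turning a playlist into skip-gram training pairs (the playlist plays the role of a sentence, tracks the role of words), and deriving an artist vector from track vectors. It assumes "marginalizing over tracks" means a plain average, which is only one plausible reading; the project may weight tracks differently. All names, the window size and the toy 2D vectors are hypothetical.

```python
from collections import defaultdict

def skipgram_pairs(playlist, window=2):
    """Yield (center, context) track pairs, treating a playlist like a sentence."""
    for i, center in enumerate(playlist):
        lo = max(0, i - window)
        hi = min(len(playlist), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, playlist[j]

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def aggregate(track_vectors, track_to_group):
    """Embed each group (album, artist, label) as the mean of its tracks' vectors.

    Assumption: a plain average stands in for the post's "marginalizing".
    """
    groups = defaultdict(list)
    for track, vec in track_vectors.items():
        groups[track_to_group[track]].append(vec)
    return {g: mean_vector(vs) for g, vs in groups.items()}

# Toy data: three tracks, two artists, 2D vectors instead of 128D.
playlist = ["t1", "t2", "t3"]
pairs = list(skipgram_pairs(playlist, window=1))
track_vectors = {"t1": [1.0, 0.0], "t2": [0.0, 1.0], "t3": [1.0, 1.0]}
track_to_artist = {"t1": "a", "t2": "a", "t3": "b"}
artist_vectors = aggregate(track_vectors, track_to_artist)
```

In a real setup the pairs would feed a word2vec trainer rather than be materialized as a list; the sketch only shows what the model sees as "context".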

Both the model and UMAP were trained in the cloud on an NVIDIA A100. All things considered, it cost me around 50 EUR over two main training sessions and a few experiments. For the slippy map I experimented with a few libraries and ended up with MapLibre GL JS. Loved working with it, kudos to its developers. For the recommender indexes I used FAISS, another fantastic piece of software. Pretty happy with the thing running on a small and cheap box.
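
For readers unfamiliar with what a recommender index does, here is a minimal sketch of the underlying query: given a vector, return the ids of the most similar items. This is a brute-force cosine search in plain Python, not FAISS and not the project's code; the post does not say which FAISS index type is used. An exact inner-product index over L2-normalized vectors would produce the same ranking, only much faster at 17M items. Names and toy vectors are hypothetical, and vectors are assumed to be non-zero.

```python
import math

def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def top_k(query, index, k=2):
    """Return the k ids with the highest cosine similarity to query, best first."""
    q = normalize(query)
    scored = []
    for item_id, vec in index.items():
        # Inner product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((score, item_id))
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]

# Toy index: four tracks in 2D instead of 128D.
index = {"t1": [1.0, 0.0], "t2": [0.9, 0.1], "t3": [0.0, 1.0], "t4": [-1.0, 0.0]}
neighbors = top_k([1.0, 0.0], index, k=2)
```

Brute force scans every item per query, which is why an index library matters once the catalog reaches millions of entities.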

Two things influenced me in making this. The first is a decade-old idea: human navigation and exploration skills work in information spaces too. Many ML concepts fit this idea especially well. It would be nice to see more experiments in this direction. The second goes more like a story. Before moving out and selling my turntable, I used to visit record fairs and I always ended up finding something novel there. To find a new record you didn't sit down and listen to ten records in a row, selected based on a supposed model of your personality. You wandered around, speaking with dealers and looking through the crates they brought with them. The crates often leaned on some genre more than another, reflecting the dealer's history and taste. There were huge stalls and there were small ones. Some were crap. Finding oddities was very easy. I wanted to write something that felt like that.

Let me know what you think, if you find any bugs, or if you have any ideas for improving it.

Repo link: https://github.com/peppedilillo/toposonico
