Satellite imagery object detection using text prompts

by eyasu6464·Mar 9, 2026·53 points·23 comments

AI Analysis

Mid · Ship It · Eye Candy

VLM-based satellite detection sounds good until you remember YOLO and specialized models handle occlusion better.

Strengths
  • Zero-shot prompting on satellite tiles is a clever pipeline: tile selection, WGS84 coordinate conversion, and GeoJSON projection are all sensible.
  • The browser-based demo requires no login, which lowers the friction of trying it; the UI for polygon drawing and layer management is clean.
Weaknesses
  • The author explicitly admits that specialized detectors (YOLO) handle occlusion better, so you're paying VLM inference latency for lower accuracy on occluded objects.
  • Satellite object detection is well-served by Maxar, Planet Labs, Esri, and open models like YOLO; unclear what new capability this adds.
Target Audience

Geospatial analysts, environmental researchers, urban planners, commercial imagery users

Similar To

Planet Labs API · Esri ArcGIS · OpenCV + YOLO pipelines

Post Description

I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.

Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.
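The tiling and coordinate-conversion steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes standard XYZ (Web Mercator) tiles of 256 pixels, uses only the standard library, and leaves out the VLM call entirely. The function names (`tiles_for_bbox`, `tile_pixel_to_lonlat`, `detection_to_feature`) are hypothetical; the `mercantile` library mentioned in the post offers comparable helpers such as `mercantile.tiles` and `mercantile.bounds`.

```python
import math

TILE_SIZE = 256  # assumption: tile width/height in pixels


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the east/south edge stay within the tile grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_pixel_to_lonlat(x, y, zoom, px, py):
    """Convert a pixel offset inside tile (x, y) to WGS84 (lon, lat)."""
    n = 2 ** zoom
    fx = x + px / TILE_SIZE
    fy = y + py / TILE_SIZE
    lon = fx / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * fy / n))))
    return lon, lat


def tiles_for_bbox(west, south, east, north, zoom):
    """List every tile (x, y) that intersects a lon/lat bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


def detection_to_feature(x, y, zoom, box, label):
    """Turn a pixel box (left, top, right, bottom) into a GeoJSON Feature."""
    left, top, right, bottom = box
    west, north = tile_pixel_to_lonlat(x, y, zoom, left, top)
    east, south = tile_pixel_to_lonlat(x, y, zoom, right, bottom)
    # Closed exterior ring, counterclockwise, in [lon, lat] order.
    ring = [[west, south], [east, south], [east, north],
            [west, north], [west, south]]
    return {
        "type": "Feature",
        "properties": {"label": label},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
```

In a real pipeline, each tile returned by `tiles_for_bbox` would be fetched and sent to the model with the text prompt, and every predicted box would pass through `detection_to_feature` before being collected into a GeoJSON FeatureCollection. Tiles that only partly overlap the drawn polygon would also need clipping or filtering, which this sketch does not attempt.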

It works reasonably well for distinct structures in a zero-shot setting. Occluded objects are still better handled by specialized detectors such as YOLO models.

There is a public demo, and no login is required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.
