A technical reference for engineers building and operating production OpenStreetMap data pipelines.
It covers the full ETL surface, from PBF and XML parsing, through
tag normalization and topology cleaning, to
routing-graph extraction, diff-based updates, and automated
validation rules.
Every page is written for mapping engineers, OSM contributors, GIS analysts, and Python ETL
developers who need deterministic, reproducible workflows at continental scale. Expect deep
dives into the binary formats, memory-aware streaming patterns, and rule-driven QA you can wire
into Dask, Ray, or plain old multiprocessing.
The content is organised in two pillars: a data fundamentals
track covering the OSM schema and serialization formats, and a workflow track
covering parsing, normalization, and routing-graph conversion.