1. Data sources

Transit schedules and geometry (GTFS): The work uses GTFS, the standard format for static transit schedules, with feeds obtained from Transitland. A feed describes which routes exist, where stops are, what times vehicles are scheduled to arrive and depart, and (when provided) the path of each trip along the street network. These are published datasets from transit agencies or aggregators, not live vehicle tracking.

Combining operators: Where a city has more than one GTFS provider (e.g. several bus operators), feeds were combined so the city is analyzed as a single network rather than as isolated subsets.

2. Which trips and segments count (filters)

The goal is to describe typical weekday, urban bus service comparable across cities, without letting night express patterns, weekend-only service, one-off holiday schedules, or very long suburban spacing dominate the results.

Mode: Analysis is limited to bus (and any additional road-based modes explicitly allowed in the per-city settings). Rail, ferry, and other modes are excluded unless deliberately included for a city.

Weekdays: Only trips that belong to a Monday–Friday service pattern are kept. Depending on how each feed encodes calendars:

  • Feeds with a weekly calendar use weekday columns so Saturday and Sunday service is excluded.
  • Feeds that rely on dated exceptions are handled so that days on which service is removed (special or holiday suspensions) do not define “normal” weekday operation.
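
Illustrative sketch (weekday filter): The weekly-calendar case can be sketched as below. The column names are the standard GTFS calendar.txt fields. The selection rule (keep a service only if all five weekday flags are set) is an assumption made for illustration, as are the function name and sample rows; the project pipeline may apply a different rule.

```python
import csv
import io

WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday", "friday")

def weekday_service_ids(calendar_csv_text):
    """Return the service_ids whose five weekday flags are all '1'.

    Assumption: a service counts as a Monday-Friday pattern when it runs on
    every weekday. Weekend-only services are dropped; a daily service is kept
    because it also runs on weekdays.
    """
    reader = csv.DictReader(io.StringIO(calendar_csv_text))
    return {
        row["service_id"]
        for row in reader
        if all(row[day] == "1" for day in WEEKDAYS)
    }

# Hypothetical calendar.txt content for demonstration only.
SAMPLE_CALENDAR = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
WK,1,1,1,1,1,0,0,20240101,20241231
SA,0,0,0,0,0,1,0,20240101,20241231
DAILY,1,1,1,1,1,1,1,20240101,20241231
"""

kept = weekday_service_ids(SAMPLE_CALENDAR)
```

Trips would then be kept only if their service_id is in the returned set. Feeds that encode service purely through calendar_dates.txt need separate handling, which this sketch does not cover.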

Geographic focus: For each city, a center point and a radius (kilometres) define the study area. Trips whose route path intersects that disk are retained so the emphasis stays on the core network around the defined center. Exact coordinates and radii are city-specific and are documented in the project’s parameter table (the authoritative list for any report that needs numbers per city).
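
Illustrative sketch (study-area test): One simple way to test a trip path against the study disk is shown below. It is an approximation: it checks whether any vertex of the path lies within the radius, so a long segment that crosses the disk without a vertex inside it would be missed. A true line-disk intersection, which is what the text describes, needs a geometry library. The function names, center, and radius are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def shape_touches_disk(shape_points, center, radius_km):
    """True if any (lat, lon) vertex of the path lies within radius_km of center.

    Vertex-only check: an approximation of the intersection test.
    """
    clat, clon = center
    return any(
        haversine_km(lat, lon, clat, clon) <= radius_km
        for lat, lon in shape_points
    )

# Hypothetical center and radius; real values live in the parameter table.
CENTER = (0.0, 0.0)
RADIUS_KM = 10.0
inside = shape_touches_disk([(0.5, 0.5), (0.05, 0.05)], CENTER, RADIUS_KM)
outside = shape_touches_disk([(1.0, 1.0), (2.0, 2.0)], CENTER, RADIUS_KM)
```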

Time of day (for link speeds and maps): Stop-to-stop segments are used only when both ends fall in the same daytime window in local schedule time, from 5:00 to 22:59 (5 a.m. up to, but not including, 11 p.m.). Segments crossing midnight in raw schedule notation are handled so times remain comparable. This focuses comparisons on the busy day and avoids stretching maps into the deepest night. It does not by itself remove all late-evening service within that window.
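
Illustrative sketch (daytime window): GTFS allows times past 24:00:00 for trips that run beyond midnight of the service day. The sketch below assumes those times are wrapped back onto a 24-hour clock before the window test; the wrapping rule and the function names are assumptions, not a description of the pipeline.

```python
WINDOW_START_S = 5 * 3600   # 05:00:00, inclusive
WINDOW_END_S = 23 * 3600    # 23:00:00, exclusive (last kept second is 22:59:59)
DAY_S = 24 * 3600

def gtfs_time_to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds; HH may exceed 23 in GTFS."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def in_daytime_window(time_at_first_stop, time_at_next_stop):
    """True when both ends of a segment fall inside the daytime window.

    Assumption: times past 24:00:00 are wrapped modulo 24 hours.
    """
    first = gtfs_time_to_seconds(time_at_first_stop) % DAY_S
    second = gtfs_time_to_seconds(time_at_next_stop) % DAY_S
    return (WINDOW_START_S <= first < WINDOW_END_S
            and WINDOW_START_S <= second < WINDOW_END_S)
```

With this rule a segment scheduled 22:58 to 23:02 is dropped because only one end is inside the window, and a segment written as 25:10 to 25:20 is treated as 01:10 to 01:20 and dropped as well.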

3. How speed is calculated

All speeds below are scheduled speeds: they come from timetabled run times and measured path lengths. They are not direct measurements from GPS on buses (unless you later add a separate data source and say so explicitly).

Trip-level speed: For a whole trip: take the path length along the route geometry from the first scheduled stop to the last scheduled stop, and divide by the scheduled elapsed time between those two points. Express the result in kilometres per hour. This summarizes how fast a full trip is planned to run, on average, along its shape.
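
Illustrative sketch (trip-level speed): The calculation is path length divided by scheduled elapsed time. The sketch assumes the path length has already been measured in kilometres, and that elapsed time runs from the departure at the first stop to the arrival at the last stop. The function names and the zero-time guard are illustrative choices.

```python
def to_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' time (HH may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def trip_speed_kmh(path_length_km, first_departure, last_arrival):
    """Scheduled average speed of a whole trip in km/h.

    Returns None when the scheduled elapsed time is not positive,
    since no meaningful speed can be derived in that case.
    """
    elapsed_s = to_seconds(last_arrival) - to_seconds(first_departure)
    if elapsed_s <= 0:
        return None
    return path_length_km / (elapsed_s / 3600.0)

# A 12 km trip scheduled from 08:00 to 08:40 (40 minutes).
example_speed = trip_speed_kmh(12.0, "08:00:00", "08:40:00")
```

For the example, 12 km in 40 minutes gives 18 km/h. Because raw GTFS times keep counting past 24:00:00, a trip from 23:50:00 to 24:20:00 yields the expected 30 minutes without special handling.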

Stop-to-stop (link) speed: For each pair of consecutive stops: take the scheduled time from departure (or arrival, as defined consistently in the pipeline) at the first stop to the corresponding time at the next stop, and divide the length of the path along the road network between those stops (from clipped route geometry, aligned with OSM bus paths where used) by that time. Convert to km/h. These link speeds feed maps, hex grids, and comparisons to halts.
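
Illustrative sketch (link speeds): The sketch below walks consecutive stop pairs of one trip. It assumes each stop already carries a cumulative distance along the path in kilometres and one consistent scheduled time; the tuple layout and function names are hypothetical. Pairs with zero or negative run time or distance are skipped, which is an assumption made here so that rounded timetables do not produce infinite speeds.

```python
def to_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' time (HH may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def link_speeds(stop_times):
    """Compute scheduled speed for each pair of consecutive stops.

    stop_times: list of (stop_id, 'HH:MM:SS', cumulative_km) tuples,
    ordered by stop sequence. Returns (from_stop, to_stop, km_per_hour).
    """
    links = []
    for (a_id, a_time, a_km), (b_id, b_time, b_km) in zip(stop_times, stop_times[1:]):
        run_time_s = to_seconds(b_time) - to_seconds(a_time)
        length_km = b_km - a_km
        if run_time_s <= 0 or length_km <= 0:
            continue  # assumption: skip links with no usable time or length
        links.append((a_id, b_id, length_km / (run_time_s / 3600.0)))
    return links

# Hypothetical trip: B and C share a scheduled time (rounded timetable).
SAMPLE_TRIP = [
    ("A", "08:00:00", 0.0),
    ("B", "08:03:00", 1.0),
    ("C", "08:03:00", 1.4),
    ("D", "08:06:00", 2.4),
]
sample_links = link_speeds(SAMPLE_TRIP)
```

In the sample, the B to C link is dropped because both stops share the same scheduled time.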

City-specific geometry: Feeds differ in how complete or consistent shape data are. Where shapes were missing, inconsistent, or hard to use, paths were handled per city—for example by building or substituting paths from OSM bus route relations when the published feed was insufficient (notably for some regions where full trip paths had to be reconstructed from the map).

4. Geographic speed grids

Hexagonal grid: The study area is covered with a hexagonal grid of cells spaced at roughly one kilometre between centres. Hexagons reduce edge effects compared with squares and give a stable geographic unit for comparison.

Linking segments to cells: Each stop-to-stop segment is intersected with the grid. Any segment that crosses or touches a cell contributes its link speed to that cell.

Aggregation: For each cell, the mean of all contributing segment speeds is computed. The pipeline also records how many segments contribute, so readers can treat sparse cells (few observations) with more caution than dense ones.
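
Illustrative sketch (cell aggregation): Once each segment has been matched to the cells it crosses or touches, the per-cell summary is a grouped mean with a count. The input layout (pairs of cell identifier and link speed) and the function name are assumptions for illustration.

```python
from collections import defaultdict

def aggregate_by_cell(observations):
    """Return {cell_id: (mean_speed_kmh, segment_count)}.

    observations: iterable of (cell_id, speed_kmh) pairs, one per
    segment-cell match. A segment that touches several cells appears
    once for each of those cells.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell_id, speed_kmh in observations:
        totals[cell_id] += speed_kmh
        counts[cell_id] += 1
    return {cell: (totals[cell] / counts[cell], counts[cell]) for cell in totals}

# Hypothetical matches: cell h1 has two segments, cell h2 has one.
cell_summary = aggregate_by_cell([("h1", 20.0), ("h1", 10.0), ("h2", 30.0)])
```

Keeping the count next to the mean lets a map or table flag sparse cells, for example by greying out cells below a chosen minimum number of segments.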

Time-of-day views: In addition to an overall mean speed per cell (over allowed hours), speeds are averaged within two-hour bands in local time, from early morning through late evening—for example 5:00–7:00, 7:00–9:00, …, 21:00–23:00, matching the options shown in the interactive atlas (mean speed plus banded views). The same cell can therefore show different values at different times of day.
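
Illustrative sketch (two-hour bands): Assigning a segment to a band can be done from its scheduled hour. The sketch assumes the band is chosen from a single time per segment (for instance the time at its first stop) and that bands are half-open, so 7:00 belongs to 7:00-9:00. Both choices, and the function name, are assumptions.

```python
def two_hour_band(hms, start_hour=5, end_hour=23, width_hours=2):
    """Return a band label such as '7:00-9:00', or None outside the window.

    Times past 24:00:00 are wrapped onto a 24-hour clock first.
    """
    hour = int(hms.split(":")[0]) % 24
    if not (start_hour <= hour < end_hour):
        return None
    band_start = start_hour + ((hour - start_hour) // width_hours) * width_hours
    return f"{band_start}:00-{band_start + width_hours}:00"

# All band labels between 5:00 and 23:00.
ALL_BANDS = [f"{h}:00-{h + 2}:00" for h in range(5, 23, 2)]
```

With the defaults this gives nine bands, from 5:00-7:00 to 21:00-23:00. Averaging link speeds separately within each band for each cell produces the banded views.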

5. Halts: traffic signals, stop signs, and bus routes

Downloading controls: For a bounding box around each city, point features for traffic signals and stop signs are taken from OpenStreetMap.

Merging duplicate map points: One real intersection is sometimes represented by several nearby nodes in OSM. Nearby points of the same control type are buffered, merged, and given a single identifier so each physical control is not counted multiple times. Traffic lights and stop signs use different buffer sizes (wider for lights, narrower for stop signs) so clustering behaves sensibly for each feature type.
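
Illustrative sketch (merging nearby points): Buffering points and dissolving overlapping buffers is equivalent to grouping points whose distance is at most twice the buffer radius, applied transitively. The sketch below does this with a small union-find on projected coordinates in metres. The buffer radii are placeholders, not the values used in the project, and the pairwise loop is quadratic, so it suits only modest point counts.

```python
import math

def cluster_points(points, buffer_m):
    """Assign one cluster id per point.

    points: list of (x_m, y_m) in a projected, metre-based system.
    Two points share a cluster when their circular buffers of radius
    buffer_m overlap, directly or through a chain of other points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= 2 * buffer_m:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel roots as compact ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]

# Hypothetical signal nodes: three around one junction, one far away.
SIGNAL_NODES = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (200.0, 0.0)]
wide = cluster_points(SIGNAL_NODES, buffer_m=7.5)    # placeholder radius
narrow = cluster_points(SIGNAL_NODES, buffer_m=4.0)  # placeholder radius
```

The two calls show why the buffer size matters: the wider radius merges the three nearby nodes into one control, while the narrower one keeps all four separate.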

Keeping controls on bus corridors: Bus route paths used in the analysis are buffered by a small distance along the road. Only controls that intersect those buffers are kept—reducing noise from signals on parallel or unrelated streets. Lights and stop signs use different buffer widths when matching to routes (wider for lights, narrower for stop signs), reflecting how precisely each needs to sit on the corridor.
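
Illustrative sketch (corridor filter): A control lies inside a buffered route when its distance to the nearest route segment is no more than the buffer width. The sketch assumes projected coordinates in metres and a route with at least two vertices; the function names and the 10 m width are placeholders rather than project settings.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def controls_on_corridor(controls, route, buffer_m):
    """Return ids of controls within buffer_m of the route polyline.

    controls: list of (control_id, (x_m, y_m)).
    route: list of (x_m, y_m) vertices, at least two.
    """
    kept = []
    for control_id, point in controls:
        nearest = min(
            point_segment_distance(point, a, b)
            for a, b in zip(route, route[1:])
        )
        if nearest <= buffer_m:
            kept.append(control_id)
    return kept

# Hypothetical L-shaped route and three controls.
ROUTE = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
CONTROLS = [("s1", (50.0, 5.0)), ("s2", (50.0, 40.0)), ("s3", (108.0, 50.0))]
on_route = controls_on_corridor(CONTROLS, ROUTE, buffer_m=10.0)
```

Running the filter once per control type with its own width reproduces the wider-for-lights, narrower-for-stop-signs behaviour described above.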

Relating controls to service: Spatial joins link remaining control points to trips or routes that pass through the buffers, supporting summaries such as controls per distance or per route.

6. Limitations

  • Schedule vs. reality: Results reflect published timetables and path geometry, not actual GPS speeds. Real traffic, weather, incidents, and dwell time at stops are only reflected insofar as the schedule already embeds them. Cities with optimistic or outdated timetables will differ systematically from on-street experience.
  • Data quality varies: GTFS freshness, shape quality, and OSM completeness differ by city and by year. Cross-city comparisons should be read as comparisons of the same method applied to available data, not as ground-truth ranking of live operations. Some transit agencies publish GTFS data in particular ways, for example with rounded schedule times or stop-to-stop geometries. These cases were dealt with individually, either by adding further data sources or by removing the city from the analysis if its data could not be standardized.

Parameter-driven definitions: Center, radius, mode rules, and stop-spacing cutoffs are set per city in a shared parameter table. Any formal report should align prose and tables with that table for the analysis vintage you cite.