Handling large GeoJSON files in Leafmap without browser lag
To keep the browser responsive when rendering massive spatial datasets, avoid passing raw GeoJSON directly to the client. Instead, simplify geometries on the server, strip non-essential properties, and deliver the data as vector tiles or chunked payloads. In Streamlit and Panel environments, pair geopandas with Leafmap’s rendering pipeline, or convert the dataset to Mapbox Vector Tiles (MVT) and render them through a vector-tile layer (add_tile_layer() handles raster tiles only and will not decode .pbf files). Cache the processed payload with framework-level decorators to avoid repeated serialization, and cap client-side features at roughly 80,000 as a rule of thumb; the real ceiling depends on geometry complexity, the renderer, and client hardware. These techniques are foundational to Spatial Component Integration & Interactive Maps in modern Python dashboards.
Why Raw GeoJSON Freezes the Browser
Leafmap wraps ipyleaflet and folium, which ultimately delegate rendering to Leaflet.js. When a GeoJSON payload exceeds ~15–20 MB, three bottlenecks trigger:
- Synchronous JSON parsing blocks the main thread, freezing UI responsiveness until the entire string is deserialized.
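- DOM node explosion occurs with Leaflet’s default SVG renderer, which creates one path element per feature, each carrying every vertex of its geometry. The Canvas renderer avoids per-feature DOM nodes but still has to redraw every vertex. Browser tabs also have a finite memory budget (typically a few gigabytes, depending on browser and platform), which large geometries can exhaust.
- Redundant coordinate precision retains sub-millimeter floats that provide zero visual benefit at dashboard zoom levels and can inflate payload size by roughly 30–50%.
RFC 7946, the GeoJSON specification, addresses the precision problem in section 11.2: it notes that six decimal places correspond to about 10 cm, that raising precision from 6 to 15 decimals can nearly double the size of a file with many detailed polygons, and that implementations should weigh the cost of unnecessary precision. The RFC does not cover geometry simplification, but the same reasoning applies: without it, the browser parses and paints detail that is never visible at the current viewport scale.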
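To see which of these bottlenecks a given file is likely to hit, it can help to measure it before rendering. The following is a minimal sketch using only the standard library; the function names (count_vertices, geometry_vertices, profile_geojson) are hypothetical helpers introduced for illustration, not part of Leafmap or geopandas.

```python
import json


def count_vertices(coords):
    """Count positions in an arbitrarily nested GeoJSON coordinates array."""
    if not coords:
        return 0
    # A position is a flat list of numbers such as [lon, lat].
    if isinstance(coords[0], (int, float)):
        return 1
    return sum(count_vertices(part) for part in coords)


def geometry_vertices(geometry):
    """Count vertices in one geometry, recursing into GeometryCollection members."""
    if geometry is None:
        return 0
    if geometry["type"] == "GeometryCollection":
        return sum(geometry_vertices(g) for g in geometry.get("geometries", []))
    return count_vertices(geometry.get("coordinates", []))


def profile_geojson(collection):
    """Return feature count, vertex count, and compact serialized size in MB."""
    features = collection.get("features", [])
    payload = json.dumps(collection, separators=(",", ":"))
    return {
        "features": len(features),
        "vertices": sum(geometry_vertices(f.get("geometry")) for f in features),
        "megabytes": len(payload.encode("utf-8")) / 1_000_000,
    }
```

Calling profile_geojson(json.load(open(path))) on a FeatureCollection gives three numbers to compare against the feature and payload limits discussed later in this article.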
Production-Ready Optimization Pipeline
The following Streamlit-compatible pipeline demonstrates how to preprocess, cache, and render large spatial datasets without triggering client-side lag. It uses geopandas for simplification with preserve_topology=True, which keeps each geometry valid but does not keep shared borders between neighboring polygons aligned, and writes a simplified GeoJSON file that Leafmap reads from disk. This approach aligns with best practices for Folium & Leafmap Integration in production environments.
import hashlib
import os

import geopandas as gpd
import leafmap.foliumap as leafmap
import streamlit as st


@st.cache_data(ttl=3600, max_entries=5)
def optimize_geojson(input_path: str, tolerance: float = 0.001, max_features: int = 80000) -> str:
    """
    Load, simplify, and export a GeoJSON optimized for browser rendering.
    Returns path to cached optimized file.
    """
    # 1. Load dataset
    gdf = gpd.read_file(input_path)

    # 2. Enforce feature cap to prevent DOM overload (keeps only the first rows)
    if len(gdf) > max_features:
        gdf = gdf.iloc[:max_features].copy()

    # 3. Ensure WGS84 before simplification (tolerance is in degrees at EPSG:4326)
    if gdf.crs and gdf.crs.to_epsg() != 4326:
        gdf = gdf.to_crs(epsg=4326)

    # 4. Simplify each geometry (Douglas-Peucker variant that keeps geometries valid)
    gdf["geometry"] = gdf.geometry.simplify(tolerance, preserve_topology=True)

    # 5. Drop heavy metadata columns; keep only what the UI needs
    keep_cols = ["geometry"] + [c for c in ["id", "name", "type"] if c in gdf.columns]
    gdf = gdf[keep_cols]

    # 6. Export to cache directory; the parameters are part of the file name so
    #    calls with different settings do not overwrite each other's output
    cache_dir = "/tmp/geojson_cache"
    os.makedirs(cache_dir, exist_ok=True)
    out_name = f"opt_{tolerance}_{max_features}_{os.path.basename(input_path)}"
    cache_path = os.path.join(cache_dir, out_name)
    gdf.to_file(cache_path, driver="GeoJSON")
    return cache_path


# Streamlit UI integration
st.title("Optimized Spatial Dashboard")
uploaded = st.file_uploader("Upload GeoJSON", type=["geojson"])

if uploaded:
    # Save temporarily for geopandas processing. The content hash in the file
    # name stops the cache from returning stale results when a different
    # upload reuses the same file name.
    data = uploaded.getvalue()
    digest = hashlib.sha256(data).hexdigest()[:16]
    temp_path = os.path.join("/tmp", f"{digest}_{os.path.basename(uploaded.name)}")
    with open(temp_path, "wb") as f:
        f.write(data)

    optimized_path = optimize_geojson(temp_path, tolerance=0.001)

    m = leafmap.Map(center=[40.0, -100.0], zoom=4)
    m.add_geojson(optimized_path, layer_name="Optimized Features")
    m.to_streamlit(height=600)
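Note on coordinate rounding: Simplifying with tolerance=0.001 (degrees) removes vertices that deviate by less than roughly 100 m from the simplified line (about 111 m north-south and 85 m east-west at 40° latitude), which is adequate for most dashboard zoom levels. Simplification reduces the number of vertices but does not shorten the decimals of those that remain, so the two techniques are complementary rather than interchangeable. Manual rounding (round(x, 5)) has to walk the nested coordinate arrays of each geometry type and can produce duplicate vertices or invalid rings, so treat simplification as the primary size reduction and validate geometries after any rounding.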
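If rounding is still wanted, the geometry-type handling can be avoided by recursing over the nested coordinate arrays rather than branching on Point, LineString, Polygon, and so on. The sketch below works on plain GeoJSON dictionaries with the standard library only; round_coords and round_geometry are hypothetical names introduced here, and the code does not re-validate geometries after rounding.

```python
def round_coords(coords, ndigits=5):
    """Round every number in a nested GeoJSON coordinates array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(part, ndigits) for part in coords]


def round_geometry(geometry, ndigits=5):
    """Return a copy of a GeoJSON geometry with rounded coordinates."""
    if geometry is None:
        return None
    if geometry["type"] == "GeometryCollection":
        members = [round_geometry(g, ndigits) for g in geometry["geometries"]]
        return {**geometry, "geometries": members}
    return {**geometry, "coordinates": round_coords(geometry["coordinates"], ndigits)}


def round_feature_collection(collection, ndigits=5):
    """Apply round_geometry to every feature, leaving properties untouched."""
    features = [
        {**feature, "geometry": round_geometry(feature.get("geometry"), ndigits)}
        for feature in collection.get("features", [])
    ]
    return {**collection, "features": features}
```

Five decimals correspond to roughly 1 m at the equator. Because rounding can snap neighboring vertices onto the same position, it is worth checking the result for validity before rendering.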
Advanced Strategies: Vector Tiles & Chunked Delivery
When datasets exceed 50 MB or contain complex polygons, even simplified GeoJSON will strain client memory. Switch to Mapbox Vector Tiles (MVT) for dynamic, zoom-level-aware rendering. MVTs clip geometries to tile boundaries, drastically reducing paint overhead.
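Use tippecanoe to pre-bake tiles from GeoJSON. Writing to a directory with -e (--output-to-directory) produces static {z}/{x}/{y}.pbf files, whereas an .mbtiles output is a single SQLite file that needs a tile server in front of it. tippecanoe gzips tiles by default, so pass --no-tile-compression unless your host sends a Content-Encoding: gzip header:
tippecanoe -e tiles --no-tile-compression --drop-densest-as-needed input.geojson
Then serve the directory from a lightweight HTTP server or cloud storage with CORS enabled. A raster layer created with add_tile_layer() cannot decode .pbf tiles, so a vector-tile layer is required. Recent Leafmap releases expose add_vector_tile() for this; confirm the method name and arguments against your installed version:
m.add_vector_tile(
    url="https://your-server/tiles/{z}/{x}/{y}.pbf",
    layer_name="Vector Tiles",
)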
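The {z}/{x}/{y} placeholders in the tile URL follow the XYZ ("slippy map") addressing commonly used for web tiles, in which each zoom level splits the Web Mercator world into 2^z by 2^z tiles. As a rough illustration of why only a small part of the dataset is requested at any time, this sketch computes the tile that contains a given point; lonlat_to_tile is a hypothetical helper, not a Leafmap function.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) XYZ tile index containing a WGS84 point at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the antimeridian or beyond the Web Mercator
    # latitude range (about +/-85.05 degrees) still map to a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y
```

For the map center used earlier (latitude 40.0, longitude -100.0) at zoom 4, the client would request tile 4/3/6.pbf plus its visible neighbors instead of the whole dataset.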
For real-time or user-filtered data, implement chunked GeoJSON streaming. Split the dataset into batches of about 10,000 features, load them sequentially, and append each batch to a single existing L.geoJSON layer by calling its addData() method. Yielding to the event loop between batches keeps the main thread responsive and allows progressive rendering.
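On the server side, the batching step can be sketched as a generator that slices any iterable of features into small FeatureCollections. This is an illustrative helper (iter_feature_batches is a hypothetical name); how each batch reaches the browser, for example through separate HTTP responses or a websocket, is left open here.

```python
from itertools import islice


def iter_feature_batches(features, batch_size=10_000):
    """Yield FeatureCollection dicts holding at most batch_size features each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(features)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield {"type": "FeatureCollection", "features": batch}
```

Each yielded dictionary is a valid GeoJSON document on its own, so it can be serialized with json.dumps and passed to addData() on the client without further wrapping.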
Hard Limits, Caching, and Framework Integration
Browser performance degrades predictably past specific thresholds. Adhere to these operational limits:
| Metric | Recommended Limit | Rationale |
|---|---|---|
| Client-side features | ~80,000 | Beyond this, expect main-thread jank and sluggish pan/zoom |
| Payload size | 15–20 MB | Larger payloads cause long parse stalls and risk out-of-memory tab crashes |
| Coordinate precision | 5–6 decimals | ~1 m (5 decimals) to ~0.1 m (6 decimals); extra digits only add bytes |
| Cache TTL | 1–4 hours | Balances freshness against recompute cost |
Leverage framework-level caching aggressively. In Streamlit, @st.cache_data stores the function’s return value (here, the path to the optimized GeoJSON) keyed on its arguments, so subsequent reruns skip the geopandas I/O. In Panel, use @pn.cache with ttl and max_items to bound cache growth. Always check the CRS before simplification: the tolerance is expressed in CRS units, so 0.001 means roughly 100 m in EPSG:4326 but 1 mm in a metric projected CRS, where it would leave geometries effectively unsimplified. Leaflet also expects GeoJSON in WGS84, so reproject before export.
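Because the simplification tolerance depends on CRS units, it can be useful to translate between degrees and ground distance when choosing a value. The sketch below uses a spherical approximation of about 111.32 km per degree of latitude; both the constant and the helper names are assumptions made for illustration, and the result is only a rough guide.

```python
import math

# Approximate length of one degree of latitude in meters (spherical Earth).
METERS_PER_DEGREE = 111_320.0


def degrees_to_meters(tolerance_deg, latitude):
    """Return (north_south_m, east_west_m) covered by a tolerance in degrees."""
    north_south = tolerance_deg * METERS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west


def meters_to_degrees(tolerance_m):
    """Convert a ground distance to degrees using the north-south scale.

    A degree of longitude is never longer than a degree of latitude, so this
    choice keeps the east-west displacement at or below tolerance_m as well.
    """
    return tolerance_m / METERS_PER_DEGREE
```

With these numbers, a tolerance of 0.001 degrees at 40° latitude comes out at about 111 m north-south and about 85 m east-west, which matches the "roughly 100 m" figure used in this article.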
When building internal tooling, prioritize server-side preprocessing over client-side filtering. The browser should only receive what is visible, simplified, and stripped of non-rendering metadata. This architecture scales from prototype to enterprise deployment without requiring WebGL or custom rendering pipelines.