RFCP Iteration 3.4.0 — Large Radius Support (20-50km)
Goal
Enable 50km radius calculations without OOM by implementing memory-efficient processing patterns.
- Current limitation: radii above 10-20 km cause OOM (5+ GB RAM usage)
- Target: 50 km radius with < 4 GB peak RAM
Phase 1: Memory-Mapped Terrain
1.1 Terrain mmap Loading
Change terrain_service to use memory-mapped files instead of loading full arrays into RAM.
File: backend/app/services/terrain_service.py
```python
# Before (loads ~25 MB per tile into RAM):
terrain = np.fromfile(f, dtype='>i2').reshape((rows, cols))

# After (near-zero RAM, the OS pages data in from disk on access):
terrain = np.memmap(f, dtype='>i2', mode='r', shape=(rows, cols))
```
Expected impact: 200-400 MB less RAM per tile.
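The same page-on-demand behaviour is available from the stdlib alone, which is handy for spot reads. This sketch pulls a single elevation sample out of an .hgt tile via `mmap`; `SAMPLES = 3601` assumes SRTM1 tiles, and `read_elevation` is a hypothetical helper, not an existing function in `terrain_service.py`:

```python
import mmap
import struct

SAMPLES = 3601  # SRTM1 tiles: 3601x3601 big-endian int16 samples (assumption)

def read_elevation(hgt_path, row, col, samples=SAMPLES):
    """Read one elevation sample without loading the tile into RAM;
    the OS pages in only the bytes actually touched."""
    with open(hgt_path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        offset = (row * samples + col) * 2              # 2 bytes per sample
        return struct.unpack_from(">h", mm, offset)[0]  # big-endian int16
```

For full-array math the `np.memmap` form above is the better fit; this variant is useful for point lookups (e.g. the site's own elevation) without touching NumPy at all.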
1.2 Terrain Disk Cache
- Save downloaded .hgt files to a persistent disk cache
- Don't keep raw arrays in memory after initial processing
- Implement LRU eviction if the cache exceeds 2 GB
- Location: ~/.rfcp/terrain_cache/
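A minimal sketch of the LRU eviction, assuming the cache is a flat directory of .hgt files and using access time as the recency signal. `evict_lru` and both constants are hypothetical names; note that atime is unreliable on `noatime` mounts, where mtime or a sidecar index would be needed instead:

```python
from pathlib import Path

CACHE_DIR = Path("~/.rfcp/terrain_cache").expanduser()
MAX_CACHE_BYTES = 2 * 1024**3  # the 2 GB budget from the plan

def evict_lru(cache_dir=CACHE_DIR, max_bytes=MAX_CACHE_BYTES):
    """Delete least-recently-used .hgt files until the cache fits the budget."""
    files = sorted(cache_dir.glob("*.hgt"), key=lambda p: p.stat().st_atime)
    total = sum(p.stat().st_size for p in files)
    for p in files:  # oldest access time first
        if total <= max_bytes:
            break
        total -= p.stat().st_size
        p.unlink()
```

Calling this after each tile download keeps the cache bounded without tracking state between runs.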
Phase 2: Tile-Based Processing
2.1 Split Large Calculations
If radius > 10 km, split the calculation area into 5 km sub-tiles.
File: backend/app/services/coverage_service.py (or new tile_processor.py)
```python
import gc
from math import ceil, cos, radians

def calculate_coverage_tiled(site, radius_m, resolution_m, settings):
    """Tile-based calculation for large radii."""
    # Small radius — use the existing single-pass path
    if radius_m <= 10000:
        return calculate_coverage_single(site, radius_m, resolution_m, settings)

    # Large radius — split into tiles
    TILE_SIZE = 5000  # 5 km tiles
    tiles = generate_tile_grid(site.lat, site.lon, radius_m, TILE_SIZE)

    all_results = []
    for i, tile in enumerate(tiles):
        log(f"Processing tile {i+1}/{len(tiles)}: {tile.bbox}")

        # Load data for this tile only
        tile_terrain = load_terrain_for_bbox(tile.bbox)
        tile_buildings = load_buildings_for_bbox(tile.bbox)

        # Calculate coverage for the tile
        tile_points = generate_grid_for_tile(tile, resolution_m)
        tile_results = calculate_points(tile_points, site, settings,
                                        tile_terrain, tile_buildings)
        all_results.extend(tile_results)

        # Free memory before the next tile
        del tile_terrain, tile_buildings
        gc.collect()

        # Report progress
        progress = (i + 1) / len(tiles) * 100
        yield_progress(progress, f"Tile {i+1}/{len(tiles)}")

    return merge_and_dedupe_results(all_results)

def generate_tile_grid(center_lat, center_lon, radius_m, tile_size_m):
    """Generate a grid of tiles covering the calculation area."""
    tiles = []

    # Degree extents of the full calculation area
    lat_delta = radius_m / 111000
    lon_delta = radius_m / (111000 * cos(radians(center_lat)))

    # Generate the tile grid
    n_tiles = ceil(radius_m * 2 / tile_size_m)
    for i in range(n_tiles):
        for j in range(n_tiles):
            tile_bbox = calculate_tile_bbox(center_lat, center_lon,
                                            i, j, n_tiles, tile_size_m)
            # Only include tiles that intersect the coverage circle
            if tile_intersects_circle(tile_bbox, center_lat, center_lon, radius_m):
                tiles.append(Tile(bbox=tile_bbox, index=(i, j)))
    return tiles
```
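`tile_intersects_circle` is referenced above but not defined. A minimal closest-point sketch, assuming `bbox = (min_lat, min_lon, max_lat, max_lon)` and a flat-earth metre conversion (adequate at these scales); the bbox layout and function signature are assumptions, not confirmed API:

```python
from math import cos, radians

def tile_intersects_circle(bbox, center_lat, center_lon, radius_m):
    """Closest-point test between an axis-aligned bbox and the coverage circle."""
    min_lat, min_lon, max_lat, max_lon = bbox  # assumed layout

    # Clamp the circle centre into the bbox to find the nearest bbox point
    nearest_lat = min(max(center_lat, min_lat), max_lat)
    nearest_lon = min(max(center_lon, min_lon), max_lon)

    # Convert the degree offsets to metres (flat-earth approximation)
    dy = (nearest_lat - center_lat) * 111_000
    dx = (nearest_lon - center_lon) * 111_000 * cos(radians(center_lat))
    return dx * dx + dy * dy <= radius_m * radius_m
```

If the centre lies inside the bbox, the clamped point is the centre itself and the test trivially passes, so tiles containing the site are always kept.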
2.2 Progressive Results via WebSocket
Send results per-tile as they complete, so user sees coverage growing.
File: backend/app/api/websocket.py
```python
import time

async def calculate_coverage_ws(websocket, params):
    total_points = 0
    started = time.monotonic()

    for tile_results in calculate_coverage_tiled_generator(params):
        total_points += len(tile_results.points)
        # Send partial results as each tile completes
        await websocket.send_json({
            "type": "partial_results",
            "points": tile_results.points,
            "progress": tile_results.progress,
            "tile": tile_results.tile_index,
            "status": f"Tile {tile_results.tile_index} complete"
        })

    # Final message
    await websocket.send_json({
        "type": "complete",
        "total_points": total_points,
        "computation_time": time.monotonic() - started
    })
```
Phase 3: SQLite Cache for OSM Data
3.1 Create Local Database
Replace in-memory OSM cache with SQLite database with spatial indexing.
File: backend/app/services/cache_db.py (NEW)
```python
import json
import os
import sqlite3

class OSMCacheDB:
    def __init__(self, db_path="~/.rfcp/osm_cache.db"):
        db_path = os.path.expanduser(db_path)
        directory = os.path.dirname(db_path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        self.conn = sqlite3.connect(db_path)
        self._init_tables()

    def _init_tables(self):
        self.conn.executescript("""
            CREATE TABLE IF NOT EXISTS buildings (
                id INTEGER PRIMARY KEY,
                osm_id TEXT UNIQUE,
                lat REAL NOT NULL,
                lon REAL NOT NULL,
                height REAL DEFAULT 10.0,
                geometry TEXT,   -- GeoJSON
                cell_key TEXT,   -- grid cell for batch loading
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            );
            CREATE INDEX IF NOT EXISTS idx_buildings_lat ON buildings(lat);
            CREATE INDEX IF NOT EXISTS idx_buildings_lon ON buildings(lon);
            CREATE INDEX IF NOT EXISTS idx_buildings_cell ON buildings(cell_key);

            CREATE TABLE IF NOT EXISTS vegetation (
                id INTEGER PRIMARY KEY,
                osm_id TEXT UNIQUE,
                lat REAL NOT NULL,
                lon REAL NOT NULL,
                type TEXT,
                geometry TEXT,
                cell_key TEXT
            );
            CREATE INDEX IF NOT EXISTS idx_veg_lat ON vegetation(lat);
            CREATE INDEX IF NOT EXISTS idx_veg_lon ON vegetation(lon);

            -- Metadata for cache invalidation
            CREATE TABLE IF NOT EXISTS cache_meta (
                cell_key TEXT PRIMARY KEY,
                data_type TEXT,
                fetched_at TIMESTAMP,
                item_count INTEGER
            );
        """)
        self.conn.commit()

    def query_buildings_bbox(self, min_lat, max_lat, min_lon, max_lon, limit=20000):
        """Query buildings within a bounding box."""
        cursor = self.conn.execute("""
            SELECT osm_id, lat, lon, height, geometry
            FROM buildings
            WHERE lat BETWEEN ? AND ?
              AND lon BETWEEN ? AND ?
            LIMIT ?
        """, (min_lat, max_lat, min_lon, max_lon, limit))
        return [self._row_to_building(row) for row in cursor]

    def _row_to_building(self, row):
        osm_id, lat, lon, height, geometry = row
        return {'id': osm_id, 'lat': lat, 'lon': lon, 'height': height,
                'geometry': json.loads(geometry) if geometry else None}

    def insert_buildings(self, buildings, cell_key):
        """Bulk insert buildings from an OSM fetch."""
        self.conn.executemany("""
            INSERT OR IGNORE INTO buildings
                (osm_id, lat, lon, height, geometry, cell_key)
            VALUES (?, ?, ?, ?, ?, ?)
        """, [
            (b['id'], b['lat'], b['lon'], b.get('height', 10),
             json.dumps(b.get('geometry')), cell_key)
            for b in buildings
        ])
        # Record the cell in cache_meta so is_cell_cached() can see it
        self.conn.execute("""
            INSERT OR REPLACE INTO cache_meta (cell_key, data_type, fetched_at, item_count)
            VALUES (?, 'buildings', datetime('now'), ?)
        """, (cell_key, len(buildings)))
        self.conn.commit()

    def is_cell_cached(self, cell_key, data_type, max_age_hours=24):
        """Check whether cell data is cached and still fresh."""
        cursor = self.conn.execute("""
            SELECT fetched_at FROM cache_meta
            WHERE cell_key = ? AND data_type = ?
              AND fetched_at > datetime('now', ?)
        """, (cell_key, data_type, f'-{max_age_hours} hours'))
        return cursor.fetchone() is not None
```
3.2 Update OSM Client
Modify OSM client to use SQLite cache.
File: backend/app/services/osm_client.py
```python
class OSMClient:
    def __init__(self):
        self.cache_db = OSMCacheDB()

    def get_buildings(self, bbox, max_count=20000):
        min_lat, min_lon, max_lat, max_lon = bbox
        cell_key = self._bbox_to_cell_key(bbox)

        # Check the cache first
        if self.cache_db.is_cell_cached(cell_key, 'buildings'):
            return self.cache_db.query_buildings_bbox(
                min_lat, max_lat, min_lon, max_lon, max_count
            )

        # Fetch from the Overpass API
        buildings = self._fetch_from_overpass(bbox, 'buildings')

        # Store in the cache
        self.cache_db.insert_buildings(buildings, cell_key)
        return buildings[:max_count]
```
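`_bbox_to_cell_key` is assumed above but not defined. One simple scheme snaps the bbox's south-west corner to a coarse grid; `bbox_to_cell_key` and `cell_deg` are hypothetical names, and `cell_deg=0.05` (~5 km) is an assumption matching the tile size. A production version would return every cell the bbox overlaps, not just one:

```python
def bbox_to_cell_key(bbox, cell_deg=0.05):
    """Build a stable cache key from a bbox by snapping its south-west
    corner to a coarse lat/lon grid."""
    min_lat, min_lon, _max_lat, _max_lon = bbox  # assumed layout
    ci = int(min_lat // cell_deg)
    cj = int(min_lon // cell_deg)
    return f"{cell_deg:.3f}:{ci}:{cj}"
```

Embedding `cell_deg` in the key means changing the grid size later invalidates old entries cleanly instead of mixing cell geometries.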
Phase 4: Worker Memory Optimization
4.1 Per-Tile Building Loading
Workers receive only tile bbox and query buildings themselves (or receive pre-filtered list).
```python
def _pool_worker_tiled(args):
    """Worker that loads buildings for its tile only."""
    tile_bbox, terrain_shm_refs, config = args

    # Load only the buildings inside this tile's bbox.
    # Note: bbox order is (min_lat, min_lon, max_lat, max_lon), but
    # query_buildings_bbox takes (min_lat, max_lat, min_lon, max_lon),
    # so unpack explicitly rather than splatting tile_bbox.
    min_lat, min_lon, max_lat, max_lon = tile_bbox
    cache_db = OSMCacheDB()
    buildings = cache_db.query_buildings_bbox(min_lat, max_lat,
                                              min_lon, max_lon, limit=5000)

    # Much smaller memory footprint per worker
    # ...rest of the calculation
```
4.2 Adaptive Worker Count
Reduce workers for large radius to prevent combined memory explosion.
```python
def get_worker_count_for_radius(radius_m, base_workers):
    """Scale down the worker count for large calculations."""
    if radius_m > 30000:
        return min(base_workers, 2)
    elif radius_m > 20000:
        return min(base_workers, 3)
    elif radius_m > 10000:
        return min(base_workers, 4)
    return base_workers
```
Phase 5: Frontend Progressive Rendering
5.1 Accumulate Partial Results
File: frontend/src/store/coverage.ts
```typescript
interface CoverageState {
  points: CoveragePoint[];
  isCalculating: boolean;
  progress: number;
  // NEW:
  partialResults: CoveragePoint[];
  tilesCompleted: number;
  totalTiles: number;
}

// Handle partial results
case 'partial_results':
  set(state => ({
    partialResults: [...state.partialResults, ...message.points],
    progress: message.progress,
    tilesCompleted: state.tilesCompleted + 1
  }));
  break;

case 'complete':
  set(state => ({
    points: state.partialResults,  // finalize
    partialResults: [],
    isCalculating: false
  }));
  break;
```
5.2 Incremental Heatmap Render
File: frontend/src/components/map/CoverageHeatmap.tsx
```tsx
function CoverageHeatmap() {
  const { points, partialResults, isCalculating } = useCoverageStore();

  // Show partial results while a calculation is running
  const displayPoints = isCalculating ? partialResults : points;

  // Throttle re-renders during streaming (at most once per 500 ms)
  const throttledPoints = useThrottle(displayPoints, 500);

  return <HeatmapLayer points={throttledPoints} />;
}
```
Implementation Order
Priority 1 — Biggest Impact
- Tile-based processing (Phase 2.1) — enables large radius
- SQLite cache (Phase 3) — reduces memory, speeds up repeated calcs
Priority 2 — Memory Reduction
- Terrain mmap (Phase 1.1) — easy win, minimal code change
- Per-tile building loading (Phase 4.1)
Priority 3 — UX Improvement
- Progressive WebSocket (Phase 2.2)
- Frontend streaming (Phase 5)
Priority 4 — Polish
- Terrain disk cache (Phase 1.2)
- Adaptive worker count (Phase 4.2)
Success Criteria
| Radius | Max Time | Max RAM |
|---|---|---|
| 20 km | < 3 min | < 3 GB |
| 30 km | < 5 min | < 3.5 GB |
| 50 km | < 10 min | < 4 GB |
- No OOM crashes at any radius up to 50km
- Progressive results visible within 30s of starting
- Cache reuse speeds up repeated calculations 5-10x
Files to Modify
Backend (Python)
| File | Changes |
|---|---|
| terrain_service.py | mmap loading, disk cache |
| coverage_service.py | tile-based routing |
| parallel_coverage_service.py | adaptive workers |
| osm_client.py | SQLite integration |
| websocket.py | streaming results |
| tile_processor.py (NEW) | tile generation & processing |
| cache_db.py (NEW) | SQLite cache layer |
Frontend (TypeScript)
| File | Changes |
|---|---|
| store/coverage.ts | partial results handling |
| CoverageHeatmap.tsx | incremental rendering |
| App.tsx | progress for tiled calc |
Testing
```bash
# Test 20km radius
curl -X POST http://localhost:8888/api/coverage/calculate \
  -H "Content-Type: application/json" \
  -d '{"radius": 20000, "resolution": 500, "preset": "standard"}'

# Monitor memory
watch -n 1 'ps aux | grep rfcp-server | awk "{print \$6/1024\" MB\"}"'

# Test 50km radius
curl -X POST http://localhost:8888/api/coverage/calculate \
  -H "Content-Type: application/json" \
  -d '{"radius": 50000, "resolution": 1000, "preset": "standard"}'
```
Notes
- Tile size 5km is a balance — smaller = more overhead, larger = more memory
- SQLite R-tree extension would be faster but requires compilation
- For the Rust version, all of this will be native and faster
"Think in tiles, stream results, cache everything" 🗺️