@mytec: feat: Phase 3.0 Architecture Refactor ✅

Major refactoring of RFCP backend:
- Modular propagation models (8 models)
- SharedMemoryManager for terrain data
- ProcessPoolExecutor parallel processing
- WebSocket progress streaming
- Building filtering pipeline (351k → 15k)
- 82 unit tests

Performance: Standard preset 38s → 5s (7.6x speedup)

Known issue: Detailed preset timeout (fix in 3.1.0)
2026-02-01 23:12:26 +02:00
parent 1dde56705a
commit defa3ad440
71 changed files with 7134 additions and 256 deletions
--- a/RFCP-Session-Summary-2025-02-01.md
+++ b/RFCP-Session-Summary-2025-02-01.md
@@ -0,0 +1,233 @@
+# RFCP Development Session Summary
+## Date: February 1, 2026 (the filename's 2025 is a typo)
+## Status: Phase 3.0 Complete, Performance Optimization Ongoing
+
+---
+
+## 🎯 Project Overview
+
+**RFCP (Radio Frequency Coverage Planning)** is a desktop application for tactical LTE network planning, part of the UMTC (Ukrainian Military Tactical Communications) project.
+
+**Tech Stack:**
+- Backend: Python/FastAPI + NumPy + ProcessPoolExecutor
+- Frontend: React + TypeScript + Vite
+- Desktop: Electron
+- Build: PyInstaller (backend), electron-builder (desktop)
+
+**Goal:** Calculate RF coverage maps that account for terrain, buildings, and vegetation.
+
+---
+
+## ✅ What Works (Phase 3.0 Achievements)
+
+### Performance
+| Preset | Before | After | Status |
+|--------|--------|-------|--------|
+| Standard (100-200m res) | 38s | **~5s** | ✅ EXCELLENT |
+| Detailed (300m, 5km) | timeout | timeout | ❌ Still broken |
+
+### Architecture (48 new files, 82 tests)
+- ✅ Modular propagation models (8 models: FreeSpace, Okumura-Hata, COST-231, ITU-R P.1546, etc.)
+- ✅ SharedMemoryManager for terrain data (zero-copy, 25 MB)
+- ✅ Building filtering (351k → 27k bbox → 15k cap)
+- ✅ WebSocket progress streaming (backend side works; the UI display is still stuck, see Current Blockers #2)
+- ✅ Clean model selection by frequency/environment
+- ✅ Worker cleanup on shutdown
+- ✅ Overpass API retry with failover (3 attempts, mirror endpoint)
+
+### New Files Structure
+```
+backend/app/
+├── propagation/     # 8 model files
+├── geometry/        # 5 files (haversine, intersection, reflection, diffraction, los)
+├── core/            # 4 files (engine, grid, calculator, result)
+├── parallel/        # 3 files (manager, worker, pool)
+├── services/        # cache.py, osm_client.py
+├── utils/           # logging.py, progress.py, units.py
+└── api/websocket.py
+
+frontend/src/
+├── hooks/useWebSocket.ts
+├── services/websocket.ts
+└── components/FrequencyBandPanel.tsx
+```
+
+---
+
+## ❌ Current Blockers
+
+### 1. Detailed Preset Timeout (CRITICAL)
+
+**Symptom:** 300s timeout, only 194/868 points calculated
+
+**Latest test results:**
+```
+[DOMINANT_PATH_VEC] Point #1: buildings=30, walls=214, dist=4887m
+302.8ms/point × 868 points = 262 seconds
+```
+
+**Root Cause Analysis:**
+- The early-return fix (from Claude Code) only covers the `buildings=[]` case
+- In practice buildings ARE present (15,000 after the cap), so the early return does not help here
+- Each point finds 17-30 nearby buildings
+- Each building has 100-295 wall segments
+- **dominant_path_service** geometry calculations are expensive
+
+**The real problem is NOT "buildings=0 is slow"**
+**The real problem IS "dominant_path with buildings is inherently slow"**
+
+**Potential solutions:**
+1. Simplify building geometry (reduce wall count)
+2. Use spatial indexing more aggressively
+3. Skip dominant_path for distant points (>3km?)
+4. Reduce building query radius
+5. Use simpler path loss model when buildings present
+6. GPU acceleration (CuPy) for geometry
+
+### 2. Progress Bar Stuck at "Initializing 5%"
+
+**Symptom:** UI shows "Initializing 5%" forever
+
+**Fix attempted:** Adding `await asyncio.sleep(0)` after `progress_fn()` to yield to the event loop. It did not help.
+
+**Likely cause:** Frontend WebSocket connection or state update issue
+
+### 3. App Close Broken
+
+**Symptom:** Clicking X kills backend but frontend stays open
+
+**Partial fix:** Worker cleanup works, but Electron window doesn't close
+
+### 4. Memory Not Released
+
+**Symptom:** 1328 MB not freed after calculation
+```
+Before: 3904 MB free
+After:  2576 MB free
+```
+
+---
+
+## 📊 Performance Analysis
+
+### Why Detailed is slow (the math):
+
+```
+Points: 868
+Buildings nearby per point: ~25 average
+Walls per building: ~150 average
+Wall intersection checks: 868 × 25 × 150 = 3,255,000
+
+At 0.1ms per check = 325 seconds
+```
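+
+The estimate above can be reproduced in a few lines of Python. The variable names are illustrative, and the 0.1 ms per-check cost is the assumed figure from the block above, not a measurement.
+
+```python
+# Back-of-envelope cost of the Detailed preset, using the figures above.
+points = 868
+buildings_per_point = 25      # approximate average
+walls_per_building = 150      # approximate average
+seconds_per_check = 0.1e-3    # assumed cost of one wall intersection check
+
+checks = points * buildings_per_point * walls_per_building
+total_seconds = checks * seconds_per_check
+
+print(f"{checks:,} checks, about {total_seconds:.1f} s")
+```
+
+Halving any one of the three factors halves the total, which is why the options under Recommended Next Steps target buildings per point and walls per building.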
+
+### Why Standard is fast:
+
+- Fewer grid points (~200 vs 868)
+- Likely skips some detailed calculations
+- Buildings still processed but fewer points to check
+
+---
+
+## 🔧 Key Files to Review
+
+### Backend (performance critical)
+```
+backend/app/services/
+├── dominant_path_service.py    # THE BOTTLENECK
+├── coverage_service.py         # Orchestration, progress
+├── parallel_coverage_service.py # Worker management
+└── buildings_service.py        # OSM fetch, caching
+```
+
+### Frontend (UI bugs)
+```
+frontend/src/
+├── App.tsx                     # Progress display
+├── store/coverage.ts           # WebSocket state
+└── services/websocket.ts       # WS connection
+```
+
+### Desktop (close bug)
+```
+desktop/main.js                 # Electron lifecycle
+```
+
+---
+
+## 🎯 Recommended Next Steps
+
+### Priority 1: Fix Detailed Performance
+
+**Option A: Aggressive spatial filtering**
+```python
+# In dominant_path_service.py
+# Only check buildings within line-of-sight corridor
+# Not all buildings within radius
+```
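+
+A minimal sketch of Option A, assuming each building can be represented by a centroid in local metric coordinates. The names `distance_to_segment` and `buildings_in_corridor`, the `(x, y)` tuple layout, and the 50 m default half-width are hypothetical and not taken from `dominant_path_service.py`.
+
+```python
+import math
+
+def distance_to_segment(p, a, b):
+    """Shortest distance from point p to segment a-b (all (x, y) in meters)."""
+    ax, ay = a
+    bx, by = b
+    px, py = p
+    dx, dy = bx - ax, by - ay
+    seg_len_sq = dx * dx + dy * dy
+    if seg_len_sq == 0.0:
+        return math.hypot(px - ax, py - ay)  # degenerate segment: a == b
+    # Project p onto the line through a and b, then clamp to the segment
+    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
+    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))
+
+def buildings_in_corridor(centroids, tx, rx, half_width_m=50.0):
+    """Keep only centroids within half_width_m of the tx-rx line (hypothetical helper)."""
+    return [c for c in centroids if distance_to_segment(c, tx, rx) <= half_width_m]
+```
+
+A centroid test can miss a large building whose footprint crosses the corridor, and reflected paths may involve walls outside it, so the half-width would need tuning against full-calculation results.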
+
+**Option B: LOD (Level of Detail)**
+```python
+# Distance > 2km: skip dominant path entirely
+# Distance 1-2km: simplified model
+# Distance < 1km: full calculation
+```
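+
+Option B could be sketched as a small tier selector. The function name `select_lod` and the tier labels are hypothetical; the thresholds are the ones from the comment block above and are untuned.
+
+```python
+def select_lod(distance_m: float) -> str:
+    """Pick a calculation tier from the transmitter-to-point distance."""
+    if distance_m > 2000.0:
+        return "skip"        # > 2 km: skip dominant path entirely
+    if distance_m >= 1000.0:
+        return "simplified"  # 1-2 km: simplified model
+    return "full"            # < 1 km: full calculation
+```
+
+With these thresholds, Point #1 from the log above (`dist=4887m`) would fall into the `skip` tier.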
+
+**Option C: Building simplification**
+```python
+# Reduce wall count per building
+# Merge adjacent buildings
+# Use bounding boxes instead of polygons for far buildings
+```
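+
+For the bounding-box idea in Option C, a minimal sketch assuming a footprint is a list of `(x, y)` vertices in meters. The function name `bounding_box_walls` and the wall representation (pairs of endpoints) are hypothetical.
+
+```python
+def bounding_box_walls(footprint):
+    """Replace a footprint polygon with the 4 walls of its axis-aligned bounding box."""
+    xs = [x for x, _ in footprint]
+    ys = [y for _, y in footprint]
+    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
+    corners = [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]
+    # Pair each corner with the next one, wrapping around to close the box
+    return [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
+```
+
+This cuts a building to 4 walls regardless of its original wall count, but the box overestimates the footprint, so it only makes sense for buildings far from the path.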
+
+### Priority 2: Fix UI Bugs
+- Debug WebSocket in browser DevTools
+- Check Electron close handler
+
+### Priority 3: Memory
+- Explicit cleanup after calculation
+- Check for leaked references
+
+---
+
+## 📝 Session Timeline
+
+1. **Phase 2.4-2.5.1** — Vectorization attempt (didn't help)
+2. **Decision** — Full Phase 3.0 architecture refactor
+3. **Architecture Doc** — 1719 lines specification
+4. **Claude Code Round 1** — 48 files, 82 tests (35 min)
+5. **Integration Round** — WebSocket, progress, model selection (20 min)
+6. **Bug Fix Round** — Memory, workers, app close (15 min)
+7. **Claude Code Fix** — Dominant path early return, Overpass retry, progress (13 min)
+8. **Current** — Still timeout, need different approach
+
+---
+
+## 💡 Key Insights
+
+1. **Vectorization alone doesn't help** — problem is algorithmic, not just numpy
+2. **SharedMemory works** — terrain in shared memory is efficient
+3. **Building count matters** — 351k→15k filtering helps but not enough
+4. **dominant_path is the bottleneck** — consistently 200-300ms/point
+5. **Standard preset proves architecture works** — fast when less work needed
+
+---
+
+## 🔗 Related Documents
+
+- `/mnt/project/RFCP-Phase-3.0-Architecture-Refactor.md` — Full architecture spec
+- `/mnt/project/SESSION-2025-01-30-Iteration-10_1-Complete.md` — Previous session
+- `/mnt/transcripts/2026-02-01-19-06-32-phase-3.0-refactor-implementation-results.txt` — Detailed transcript
+
+---
+
+## 🎮 Side Project
+
+During this session, also designed **DF Diplomacy Expanded** mod:
+- Design doc: `DF-Diplomacy-Expanded-Design-Doc.md` (1202 lines)
+- MVP: War score, peace negotiation, tribute, reputation
+- Motto: *"Losing is fun, but sometimes you want to lose diplomatically."*
+
+---
+
+*"Standard preset works beautifully. Detailed preset needs love. The architecture is solid — now we optimize."*