Major refactoring of RFCP backend:

- Modular propagation models (8 models)
- SharedMemoryManager for terrain data
- ProcessPoolExecutor parallel processing
- WebSocket progress streaming
- Building filtering pipeline (351k → 15k)
- 82 unit tests

Performance: Standard preset 38s → 5s (7.6x speedup)

Known issue: Detailed preset timeout (fix in 3.1.0)

# RFCP Development Session Summary
## Date: February 1, 2026
## Status: Phase 3.0 Complete, Performance Optimization Ongoing

---

## 🎯 Project Overview

**RFCP (Radio Frequency Coverage Planning)** is a desktop application for tactical LTE network planning, part of the UMTC (Ukrainian Military Tactical Communications) project.

**Tech Stack:**
- Backend: Python/FastAPI + NumPy + ProcessPoolExecutor
- Frontend: React + TypeScript + Vite
- Desktop: Electron
- Build: PyInstaller (backend), electron-builder (desktop)

**Goal:** Calculate RF coverage maps with terrain, building, and vegetation analysis.

---

## ✅ What Works (Phase 3.0 Achievements)

### Performance
| Preset | Before | After | Status |
|--------|--------|-------|--------|
| Standard (100-200m res) | 38s | **~5s** | ✅ EXCELLENT |
| Detailed (300m, 5km) | timeout | timeout | ❌ Still broken |

### Architecture (48 new files, 82 tests)
- ✅ Modular propagation models (8 models: FreeSpace, Okumura-Hata, COST-231, ITU-R P.1546, etc.)
- ✅ SharedMemoryManager for terrain data (zero-copy, 25 MB)
- ✅ Building filtering (351k → 27k bbox → 15k cap)
- ✅ WebSocket progress streaming (backend works)
- ✅ Clean model selection by frequency/environment
- ✅ Worker cleanup on shutdown
- ✅ Overpass API retry with failover (3 attempts, mirror endpoint)
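
The building filter listed above (351k → 27k by bounding box → 15k cap) can be sketched as a two-stage pass. This is a minimal illustration, not the project's code: `Building`, `approx_distance_m`, and `filter_buildings` are hypothetical names, and keeping the buildings nearest the site when the cap applies is an assumption, since the summary does not say how the 15k are chosen.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Building:
    """Hypothetical record: building centroid in decimal degrees."""
    lat: float
    lon: float


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation; adequate for ranking nearby buildings."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6_371_000.0 * math.hypot(dx, dy)


def filter_buildings(buildings, site_lat, site_lon, bbox, cap):
    """Stage 1: keep buildings inside bbox = (south, west, north, east).
    Stage 2: if more than `cap` remain, keep the `cap` nearest the site
    (assumed policy)."""
    south, west, north, east = bbox
    inside = [
        b for b in buildings
        if south <= b.lat <= north and west <= b.lon <= east
    ]
    if len(inside) <= cap:
        return inside
    inside.sort(key=lambda b: approx_distance_m(site_lat, site_lon, b.lat, b.lon))
    return inside[:cap]
```

Ranking by distance to the site is only one possible cap policy; ranking by footprint area or by proximity to the coverage grid would fit the same structure.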

### New Files Structure
```
backend/app/
├── propagation/     # 8 model files
├── geometry/        # 5 files (haversine, intersection, reflection, diffraction, los)
├── core/            # 4 files (engine, grid, calculator, result)
├── parallel/        # 3 files (manager, worker, pool)
├── services/        # cache.py, osm_client.py
├── utils/           # logging.py, progress.py, units.py
└── api/websocket.py

frontend/src/
├── hooks/useWebSocket.ts
├── services/websocket.ts
└── components/FrequencyBandPanel.tsx
```

---

## ❌ Current Blockers

### 1. Detailed Preset Timeout (CRITICAL)

**Symptom:** 300s timeout, only 194/868 points calculated

**Latest test results:**
```
[DOMINANT_PATH_VEC] Point #1: buildings=30, walls=214, dist=4887m
302.8ms/point × 868 points ≈ 263 seconds
```

**Root Cause Analysis:**
- The early-return fix (Claude Code) only covered the `buildings=[]` case
- In practice, buildings ARE present (15,000 after cap)
- Each point finds 17-30 nearby buildings
- Each building has 100-295 wall segments
- **dominant_path_service** geometry calculations are expensive

**The real problem is NOT "buildings=0 is slow"**

**The real problem IS "dominant_path with buildings is inherently slow"**

**Potential solutions:**
1. Simplify building geometry (reduce wall count)
2. Use spatial indexing more aggressively
3. Skip dominant_path for distant points (>3km?)
4. Reduce building query radius
5. Use a simpler path loss model when buildings are present
6. GPU acceleration (CuPy) for geometry

### 2. Progress Bar Stuck at "Initializing 5%"

**Symptom:** UI shows "Initializing 5%" forever

**Fix attempted:** `await asyncio.sleep(0)` after `progress_fn()`; it did not help

**Likely cause:** Frontend WebSocket connection or state update issue

### 3. App Close Broken

**Symptom:** Clicking X kills the backend but the frontend stays open

**Partial fix:** Worker cleanup works, but the Electron window doesn't close

### 4. Memory Not Released

**Symptom:** 1328 MB not freed after calculation
```
Before: 3904 MB free
After:  2576 MB free
```

---

## 📊 Performance Analysis

### Why Detailed is slow (the math):

```
Points: 868
Buildings nearby per point: ~25 average
Walls per building: ~150 average
Wall intersection checks: 868 × 25 × 150 = 3,255,000

At 0.1ms per check = 325.5 seconds
```
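
The estimate above can be reproduced, and re-run with different inputs, in a few lines. This is a minimal sketch: `estimate_seconds` is a hypothetical helper, the inputs are the averages quoted above taken at face value, and the result is a serial figure that ignores any speedup from the worker pool.

```python
def estimate_seconds(points, buildings_per_point, walls_per_building, ms_per_check):
    """Return (total wall checks, serial runtime in seconds)."""
    checks = points * buildings_per_point * walls_per_building
    return checks, checks * ms_per_check / 1000.0


checks, seconds = estimate_seconds(868, 25, 150, 0.1)
print(f"{checks:,} checks, {seconds:.1f} s")  # 3,255,000 checks, 325.5 s
```

Under the same assumptions, an average of 50 walls per building gives 1,085,000 checks, or about 108.5 seconds, which shows how directly the wall count drives the total.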

### Why Standard is fast:

- Lower resolution = fewer points (~200 vs 868)
- Likely skips some detailed calculations
- Buildings still processed but fewer points to check

---

## 🔧 Key Files to Review

### Backend (performance critical)
```
backend/app/services/
├── dominant_path_service.py      # THE BOTTLENECK
├── coverage_service.py           # Orchestration, progress
├── parallel_coverage_service.py  # Worker management
└── buildings_service.py          # OSM fetch, caching
```

### Frontend (UI bugs)
```
frontend/src/
├── App.tsx                # Progress display
├── store/coverage.ts      # WebSocket state
└── services/websocket.ts  # WS connection
```

### Desktop (close bug)
```
desktop/main.js  # Electron lifecycle
```

---

## 🎯 Recommended Next Steps

### Priority 1: Fix Detailed Performance

**Option A: Aggressive spatial filtering**
```python
# In dominant_path_service.py:
# only check buildings inside the line-of-sight corridor
# between the transmitter and the point,
# not every building within the query radius
```
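
One way to read Option A is as a point-to-segment distance test against the transmitter-to-point line. The sketch below assumes building centroids already projected to a planar frame in metres; `point_segment_distance`, `buildings_in_corridor`, and `half_width_m` are hypothetical names, not part of `dominant_path_service.py`. A narrow corridor keeps direct-path obstructions but may drop buildings that only contribute reflected paths, and a centroid test can miss large footprints, so the width would need tuning.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to segment AB in a planar (metre) frame."""
    abx, aby = bx - ax, by - ay
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        # Degenerate segment: transmitter and receiver coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of P onto AB, clamped to the segment ends.
    t = ((px - ax) * abx + (py - ay) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def buildings_in_corridor(centroids, tx, rx, half_width_m):
    """Indices of centroids within half_width_m of the tx-rx segment."""
    return [
        i for i, (cx, cy) in enumerate(centroids)
        if point_segment_distance(cx, cy, tx[0], tx[1], rx[0], rx[1]) <= half_width_m
    ]


# Example: a 1 km path along the x axis with a 50 m half-width.
centroids = [(500.0, 10.0), (500.0, 200.0), (-300.0, 0.0), (1000.0, 40.0)]
kept = buildings_in_corridor(centroids, (0.0, 0.0), (1000.0, 0.0), 50.0)
```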

**Option B: LOD (Level of Detail)**
```python
# Distance > 2km: skip dominant path entirely
# Distance 1-2km: simplified model
# Distance < 1km: full calculation
```
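
The three tiers in Option B map onto a small dispatch function. This is a sketch under the thresholds listed above; `Detail` and `detail_for_distance` are hypothetical names, and what the "simplified model" actually computes is left open here, as it is in the notes.

```python
from enum import Enum


class Detail(Enum):
    FULL = "full"              # full dominant-path calculation
    SIMPLIFIED = "simplified"  # cheaper model, to be defined
    SKIP = "skip"              # no dominant-path work at all


def detail_for_distance(distance_m, full_below_m=1000.0, skip_above_m=2000.0):
    """Pick a level of detail from the transmitter-to-point distance."""
    if distance_m < full_below_m:
        return Detail.FULL
    if distance_m <= skip_above_m:
        return Detail.SIMPLIFIED
    return Detail.SKIP


# The logged point at dist=4887m would skip dominant path entirely.
level = detail_for_distance(4887.0)
```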

**Option C: Building simplification**
```python
# Reduce wall count per building
# Merge adjacent buildings
# Use bounding boxes instead of polygons for far buildings
```
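
The last idea in Option C, bounding boxes for far buildings, can be sketched as follows. The assumptions are planar footprints given as vertex lists without a repeated closing vertex, and a hypothetical `bbox_beyond_m` threshold; `footprint_walls`, `bounding_box`, and `walls_for` are illustrative names only.

```python
def footprint_walls(polygon):
    """Wall segments of a closed footprint [(x, y), ...]."""
    count = len(polygon)
    return [(polygon[i], polygon[(i + 1) % count]) for i in range(count)]


def bounding_box(polygon):
    """Axis-aligned bounding box as a 4-vertex footprint."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return [
        (min(xs), min(ys)),
        (max(xs), min(ys)),
        (max(xs), max(ys)),
        (min(xs), max(ys)),
    ]


def walls_for(polygon, distance_m, bbox_beyond_m=1000.0):
    """Full walls for near buildings, 4 bounding-box walls for far ones."""
    if distance_m > bbox_beyond_m:
        return footprint_walls(bounding_box(polygon))
    return footprint_walls(polygon)


# An L-shaped footprint has 6 walls up close and 4 when far away.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
near_walls = walls_for(l_shape, 200.0)
far_walls = walls_for(l_shape, 3000.0)
```

A bounding box is never smaller than the footprint it replaces, so this trade errs toward predicting more obstruction for distant buildings.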

### Priority 2: Fix UI Bugs
- Debug WebSocket in browser DevTools
- Check Electron close handler

### Priority 3: Memory
- Explicit cleanup after calculation
- Check for leaked references
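
For the leaked-reference check, one low-cost starting point is the standard library's `tracemalloc`, comparing snapshots taken before and after a calculation. This is a sketch only: `top_growth` is a hypothetical helper, and it sees Python-level allocations in the process where it runs, so memory held by worker processes would need the same check inside each worker.

```python
import tracemalloc


def top_growth(run, limit=5):
    """Run a callable and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    run()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to sorts the differences from largest to smallest.
    return after.compare_to(before, "lineno")[:limit]


# Example: a list that outlives the call stands in for a leaked reference.
retained = []


def fake_calculation():
    retained.append(bytearray(1_000_000))


for stat in top_growth(fake_calculation):
    print(stat)
```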

---

## 📝 Session Timeline

1. **Phase 2.4-2.5.1**: Vectorization attempt (didn't help)
2. **Decision**: Full Phase 3.0 architecture refactor
3. **Architecture Doc**: 1719-line specification
4. **Claude Code Round 1**: 48 files, 82 tests (35 min)
5. **Integration Round**: WebSocket, progress, model selection (20 min)
6. **Bug Fix Round**: Memory, workers, app close (15 min)
7. **Claude Code Fix**: Dominant path early return, Overpass retry, progress (13 min)
8. **Current**: Still timing out, need a different approach

---

## 💡 Key Insights

1. **Vectorization alone doesn't help**: the problem is algorithmic, not just NumPy
2. **SharedMemory works**: terrain in shared memory is efficient
3. **Building count matters**: 351k→15k filtering helps but not enough
4. **dominant_path is the bottleneck**: consistently 200-300ms/point
5. **Standard preset proves architecture works**: fast when less work is needed

---

## 🔗 Related Documents

- `/mnt/project/RFCP-Phase-3.0-Architecture-Refactor.md`: Full architecture spec
- `/mnt/project/SESSION-2025-01-30-Iteration-10_1-Complete.md`: Previous session
- `/mnt/transcripts/2026-02-01-19-06-32-phase-3.0-refactor-implementation-results.txt`: Detailed transcript

---

## 🎮 Side Project

During this session, the **DF Diplomacy Expanded** mod was also designed:
- Design doc: `DF-Diplomacy-Expanded-Design-Doc.md` (1202 lines)
- MVP: War score, peace negotiation, tribute, reputation
- Motto: *"Losing is fun, but sometimes you want to lose diplomatically."*

---

*"Standard preset works beautifully. Detailed preset needs love. The architecture is solid; now we optimize."*