@mytec: iter3.7.0 start, gpu calc int

2026-02-03 22:41:08 +02:00
parent a61753c642
commit 6cd9d869cc
29 changed files with 2288 additions and 28 deletions
--- a/RFCP-3.7.0-GPU-Coverage-Task.md
+++ b/RFCP-3.7.0-GPU-Coverage-Task.md
@@ -0,0 +1,133 @@
+# RFCP 3.7.0 — GPU-Accelerated Coverage Calculations
+
+## Context
+
+Iteration 3.6.0 is complete: CuPy-cuda13x works in the production PyInstaller build,
+the RTX 4060 is detected, and the ONEDIR build ships the CUDA DLLs. However, coverage
+calculations still run on the CPU because coverage_service.py calls NumPy directly
+(`import numpy as np`) instead of going through the GPU backend.
+
+The GPU infrastructure is ready:
+- `app/services/gpu_backend.py` has `GPUManager.get_array_module()` → returns cupy or numpy
+- `/api/gpu/status` confirms `"active_backend": "cuda"`
+- CuPy is imported and GPU detected in the frozen exe
+
+## Goal
+
+Replace direct `np.` calls in coverage_service.py with `xp.` calls, where
+`xp = gpu_manager.get_array_module()`, so calculations run on the GPU when available,
+with automatic NumPy fallback.
+
+## Files to Modify
+
+### `app/services/coverage_service.py`
+
+**Line 7**: `import numpy as np` — keep this but also import gpu_manager
+
+Add near top:
+```python
+from app.services.gpu_backend import gpu_manager
+```
+
+**Key sections to GPU-accelerate** (highest impact first):
+
+#### 1. Grid array creation (lines 549-550, 922-923)
+```python
+# BEFORE:
+grid_lats = np.array([lat for lat, lon in grid])
+grid_lons = np.array([lon for lat, lon in grid])
+
+# AFTER:
+xp = gpu_manager.get_array_module()
+grid_lats = xp.array([lat for lat, lon in grid])
+grid_lons = xp.array([lon for lat, lon in grid])
+```
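+
+A possible refinement, sketched under the assumption that `grid` is a list of `(lat, lon)`
+tuples as in the snippet above: convert the list once and slice the columns, so the list is
+traversed (and, under CuPy, copied to the device) a single time. `split_grid` is a
+hypothetical helper, not an existing function, and `xp` is passed in so the sketch also
+runs with plain NumPy.
+
+```python
+import numpy as np
+
+def split_grid(grid, xp=np):
+    """Return (lats, lons) arrays from a list of (lat, lon) tuples."""
+    # One conversion instead of two separate list comprehensions.
+    pts = xp.asarray(grid, dtype=xp.float64).reshape(-1, 2)
+    return pts[:, 0], pts[:, 1]
+
+# Illustrative coordinates only.
+grid = [(50.45, 30.52), (50.46, 30.53), (50.47, 30.54)]
+grid_lats, grid_lons = split_grid(grid)
+```
+
+In the service, `xp` would come from `gpu_manager.get_array_module()` inside the calling function.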
+
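+#### 2. Trig calculations (lines 468, 1031, 1408-1415, 1442)
+These use np.cos, np.radians, np.sin, np.degrees, np.arctan2, all of which have CuPy equivalents.
+For scalar inputs such as `center_lat`, wrap the result in `float()` so a 0-d CuPy array
+does not leak into CPU code; per Rule 3 below, scalar-only lines can also simply stay on np.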
+```python
+# BEFORE:
+lon_delta = settings.radius / (111000 * np.cos(np.radians(center_lat)))
+cos_lat = np.cos(np.radians(center_lat))
+
+# AFTER:
+xp = gpu_manager.get_array_module()
+lon_delta = settings.radius / (111000 * float(xp.cos(xp.radians(center_lat))))
+cos_lat = float(xp.cos(xp.radians(center_lat)))
+```
+
+#### 3. The heavy calculation loop — `_run_point_loop` (line 1070) and `_calculate_point_sync` (line 1112)
+This is where 90% of the time is spent. Points are currently processed one by one.
+The GPU win comes from vectorizing the path loss calculation across ALL grid points at once.
+
+**Strategy**: Instead of looping through points, create arrays of all distances/angles
+and compute path loss for all points in one vectorized operation.
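+
+The strategy above can be sketched as follows. This is a minimal illustration, not the
+service's code: `haversine_m` and `fspl_db` are hypothetical names, and free-space path loss
+is used only as a stand-in for the real propagation model, whose math must stay unchanged.
+Scalar site coordinates are handled with `math` on the CPU; only per-point arrays go through `xp`.
+
+```python
+import math
+import numpy as np
+
+EARTH_RADIUS_M = 6_371_000.0
+
+def haversine_m(lat0, lon0, lats, lons, xp=np):
+    """Great-circle distance in metres from one site to every grid point."""
+    phi0, lam0 = math.radians(lat0), math.radians(lon0)  # scalars stay on CPU
+    phi, lam = xp.radians(lats), xp.radians(lons)
+    a = (xp.sin((phi - phi0) / 2) ** 2
+         + math.cos(phi0) * xp.cos(phi) * xp.sin((lam - lam0) / 2) ** 2)
+    return 2 * EARTH_RADIUS_M * xp.arcsin(xp.sqrt(a))
+
+def fspl_db(dist_m, freq_mhz, xp=np):
+    """Free-space path loss in dB (placeholder for the real model)."""
+    d_km = xp.maximum(dist_m, 1.0) / 1000.0  # clamp to avoid log10(0) at the site
+    return 20 * xp.log10(d_km) + 20 * math.log10(freq_mhz) + 32.44
+
+# One vectorized pass over all points instead of a per-point loop.
+lats = np.array([50.0, 51.0])
+lons = np.array([30.0, 30.0])
+dist = haversine_m(50.0, 30.0, lats, lons)
+loss = fspl_db(dist, 1800.0)
+```
+
+Terrain and building checks that depend on per-point lookups may not vectorize this directly
+and could remain in the existing loop.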
+
+#### 4. `_calculate_bearing` (line 1402) — already vectorizable
+```python
+# All np.* functions here have direct CuPy equivalents
+# Just replace np → xp
+```
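+
+As an illustration of the np → xp swap, here is a sketch of the standard initial-bearing
+formula written against `xp`. The actual body of `_calculate_bearing` is not shown in this
+task, so `bearing_deg` is a hypothetical stand-in and its signature is an assumption.
+
+```python
+import math
+import numpy as np
+
+def bearing_deg(lat0, lon0, lats, lons, xp=np):
+    """Initial bearing in degrees [0, 360) from (lat0, lon0) to each point."""
+    phi0 = math.radians(lat0)                      # scalar: plain math on CPU
+    phi = xp.radians(lats)
+    dlam = xp.radians(lons) - math.radians(lon0)
+    y = xp.sin(dlam) * xp.cos(phi)
+    x = math.cos(phi0) * xp.sin(phi) - math.sin(phi0) * xp.cos(phi) * xp.cos(dlam)
+    return (xp.degrees(xp.arctan2(y, x)) + 360.0) % 360.0
+
+# Points due north, east, south and west of (0, 0).
+bearings = bearing_deg(0.0, 0.0,
+                       np.array([1.0, 0.0, -1.0, 0.0]),
+                       np.array([0.0, 1.0, 0.0, -1.0]))
+```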
+
+## Important Rules
+
+1. **Always get xp at function scope**, not module scope:
+   ```python
+   def my_function(self, ...):
+       xp = gpu_manager.get_array_module()
+       # use xp instead of np
+   ```
+
+2. **Convert GPU arrays back to CPU** before returning to non-GPU code:
+   ```python
+   if hasattr(result, 'get'):  # CuPy array
+       result = result.get()  # → numpy array
+   ```
+
+3. **Keep np for small/scalar operations** — GPU overhead isn't worth it for single values.
+   Only use xp for array operations on 100+ elements.
+
+4. **Don't break the fallback** — if CuPy isn't available, `get_array_module()` returns numpy,
+   so `xp.array()` etc. work identically.
+
+5. **Test both paths** — run with GPU and verify same results as CPU.
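+
+Rules 2 and 5 can be combined in a small check like the one below. It is a sketch:
+`to_numpy` and `results_match` are hypothetical helpers, and the tolerances are assumed
+values that may need tuning, since GPU and CPU floating-point results can differ slightly.
+
+```python
+import numpy as np
+
+def to_numpy(arr):
+    """Return a NumPy array, copying from the GPU if `arr` is a CuPy array."""
+    if hasattr(arr, "get"):   # CuPy arrays expose .get(); NumPy arrays do not
+        return arr.get()
+    return np.asarray(arr)
+
+def results_match(cpu_result, gpu_result, rtol=1e-5, atol=1e-6):
+    """True when both backends agree within floating-point tolerance."""
+    return bool(np.allclose(to_numpy(cpu_result), to_numpy(gpu_result),
+                            rtol=rtol, atol=atol))
+```
+
+Comparing with a tolerance rather than exact equality avoids false failures from
+rounding differences between the two backends.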
+
+## Testing
+
+After changes:
+```powershell
+# Rebuild
+cd D:\root\rfcp\backend
+pyinstaller ..\installer\rfcp-server-gpu.spec --noconfirm
+
+# Run
+.\dist\rfcp-server\rfcp-server.exe
+
+# Test a calculation via the frontend and watch GPU utilization in Task Manager
+# GPU Compute should spike during the coverage calculation
+# The run should finish well under the ~10s CPU baseline for 1254 points
+```
+
+Compare before/after:
+- Current (CPU): ~10s for 1254 points, 5km radius
+- Expected (GPU): 1-3s for same calculation
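+
+For a repeatable before/after number, a small timing helper along these lines may help.
+`time_call` is a hypothetical utility; the optional `sync` callback is there because GPU
+kernels run asynchronously, so for CuPy one would pass something like
+`cupy.cuda.Stream.null.synchronize` to wait for the work to finish before stopping the clock.
+
+```python
+import time
+
+def time_call(fn, *args, sync=None, repeats=3):
+    """Return the best wall-clock time in seconds over `repeats` runs."""
+    best = float("inf")
+    for _ in range(repeats):
+        start = time.perf_counter()
+        fn(*args)
+        if sync is not None:
+            sync()  # wait for pending GPU work before reading the clock
+        best = min(best, time.perf_counter() - start)
+    return best
+
+# CPU-only usage example with a trivial workload.
+elapsed = time_call(sum, range(1000))
+```
+
+Taking the best of several runs reduces the effect of one-off costs such as first-call
+kernel compilation.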
+
+Also test GPU diagnostics:
+```
+curl http://localhost:8888/api/gpu/diagnostics
+```
+
+## What NOT to Change
+
+- Don't modify gpu_backend.py — it's working correctly
+- Don't change the API endpoints or response format
+- Don't remove the NumPy import — keep it for non-array operations
+- Don't change propagation model math — only the array operations
+- Don't change _filter_buildings_to_bbox or OSM functions — they use lists not arrays
+
+## Success Criteria
+
+- [ ] Coverage calculation uses GPU (visible in Task Manager)
+- [ ] Calculation time reduced for 1000+ point grids
+- [ ] CPU fallback still works (test by setting active_backend to cpu via API)
+- [ ] Same coverage results (heatmap should look identical)
+- [ ] No regression in tiled processing mode