mytec/rfcp


mytec 6cd9d869cc @mytec : iter3.7.0 start, gpu calc int

2026-02-03 22:41:08 +02:00


RFCP 3.7.0 — GPU-Accelerated Coverage Calculations

Context

Iteration 3.6.0 is complete: CuPy-cuda13x works in the production PyInstaller build, the RTX 4060 is detected, and the ONEDIR build ships the CUDA DLLs. However, coverage calculations still run on the CPU, because coverage_service.py calls NumPy directly (import numpy as np) instead of going through the GPU backend.

The GPU infrastructure is ready:

app/services/gpu_backend.py has GPUManager.get_array_module() → returns cupy or numpy
/api/gpu/status confirms "active_backend": "cuda"
CuPy is imported and GPU detected in the frozen exe
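For orientation, the CuPy-or-NumPy selection that get_array_module() provides can be pictured roughly as below. This is a minimal sketch, not the code in gpu_backend.py (which is not shown here): the standalone function and its prefer_gpu parameter are assumptions made for illustration.

```python
import numpy as np

def get_array_module(prefer_gpu=True):
    """Hypothetical stand-in: return cupy if usable, otherwise numpy."""
    if prefer_gpu:
        try:
            import cupy as cp
            # Importing can succeed without a usable device, so check for one.
            if cp.cuda.runtime.getDeviceCount() > 0:
                return cp
        except Exception:
            # CuPy missing, driver missing, or no CUDA device: use NumPy.
            pass
    return np

xp = get_array_module()
total = float(xp.arange(5).sum())  # same call on either backend
```

Because both modules expose the same array API for the calls used here, the calling code does not need to know which one it received.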

Goal

Replace direct np.* array calls in coverage_service.py with xp.* calls, where xp = gpu_manager.get_array_module(), so that calculations run on the GPU when it is available and fall back to NumPy automatically.

Files to Modify

`app/services/coverage_service.py`

Line 7: import numpy as np — keep this but also import gpu_manager

Add near top:

from app.services.gpu_backend import gpu_manager

Key sections to GPU-accelerate (highest impact first):

1. Grid array creation (lines 549-550, 922-923)

# BEFORE:
grid_lats = np.array([lat for lat, lon in grid])
grid_lons = np.array([lon for lat, lon in grid])

# AFTER:
xp = gpu_manager.get_array_module()
grid_lats = xp.array([lat for lat, lon in grid])
grid_lons = xp.array([lon for lat, lon in grid])
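As a possible refinement, both coordinate arrays can be built from one pass over the grid and one host-to-device copy, instead of two list comprehensions. The helper name grid_to_arrays is hypothetical, and the sketch assumes grid is a sequence of (lat, lon) pairs as in the snippet above. In the service, xp would come from gpu_manager.get_array_module(); it defaults to NumPy here so the example runs anywhere.

```python
import numpy as np

def grid_to_arrays(grid, xp=np):
    """Return (lats, lons) as xp arrays from a sequence of (lat, lon) pairs."""
    # Build one (N, 2) host array; reshape keeps an empty grid well-formed.
    host = np.asarray(grid, dtype=np.float64).reshape(-1, 2)
    # Single transfer to the device (a no-op copy when xp is numpy).
    coords = xp.asarray(host)
    return coords[:, 0], coords[:, 1]

grid = [(50.0, 30.0), (50.1, 30.2)]
grid_lats, grid_lons = grid_to_arrays(grid)
```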

2. Trig calculations (lines 468, 1031, 1408-1415, 1442)

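These use np.cos, np.radians, np.sin, np.degrees, and np.arctan2, all of which have CuPy equivalents. Switch to xp only where the argument is an array. The two lines below operate on the scalar center_lat, so under the scalar rule in Important Rules they stay on np: sending a single value through the GPU costs a kernel launch and a device-to-host copy for no gain.

# BEFORE (scalar input, leave on np):
lon_delta = settings.radius / (111000 * np.cos(np.radians(center_lat)))
cos_lat = np.cos(np.radians(center_lat))

# AFTER (array input only, e.g. per-point latitudes):
xp = gpu_manager.get_array_module()
cos_lats = xp.cos(xp.radians(grid_lats))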

3. The heavy calculation loop — `_run_point_loop` (line 1070) and `_calculate_point_sync` (line 1112)

About 90% of the calculation time is spent here. The loop currently processes points one at a time. The GPU win comes from vectorizing the path loss calculation across ALL grid points at once.

Strategy: Instead of looping through points, create arrays of all distances/angles and compute path loss for all points in one vectorized operation.
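The strategy can be sketched as follows. To keep the example self-contained it uses free-space path loss as a stand-in for the project's propagation models, which are not shown here and must not be changed. The function names haversine_m and fspl_db are hypothetical. xp defaults to NumPy; in the service it would be gpu_manager.get_array_module().

```python
import math
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def haversine_m(lat0, lon0, lats, lons, xp=np):
    """Great-circle distance in metres from one site to every grid point."""
    phi0 = math.radians(lat0)  # scalar: plain math, no GPU round trip
    phi = xp.radians(lats)
    dlam = xp.radians(lons - lon0)
    a = (xp.sin((phi - phi0) / 2) ** 2
         + math.cos(phi0) * xp.cos(phi) * xp.sin(dlam / 2) ** 2)
    # Clip guards against rounding pushing a slightly outside [0, 1].
    return 2.0 * EARTH_RADIUS_M * xp.arcsin(xp.sqrt(xp.clip(a, 0.0, 1.0)))

def fspl_db(dist_m, freq_mhz, xp=np):
    """Free-space path loss in dB for an array of distances (stand-in model)."""
    d_km = xp.maximum(dist_m, 1.0) / 1000.0  # clamp to 1 m to avoid log10(0)
    return 20.0 * xp.log10(d_km) + 20.0 * math.log10(freq_mhz) + 32.44

# One call per stage covers the whole grid; there is no per-point loop.
lats = np.array([0.0, 0.0])
lons = np.array([0.0, 1.0])
dist = haversine_m(0.0, 0.0, lats, lons)
loss = fspl_db(dist, 1800.0)
```

Terrain and building checks that depend on per-point lookups may not vectorize this directly, so the loop might only be partly replaceable.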

4. `_calculate_bearing` (line 1402) — already vectorizable

# All np.* functions here have direct CuPy equivalents
# Just replace np → xp
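For reference, a vectorized initial-bearing calculation written against xp might look like the sketch below. It uses the standard great-circle bearing formula; the actual body and signature of _calculate_bearing are not shown in this document, so the name bearing_deg and its parameters are assumptions.

```python
import numpy as np

def bearing_deg(lat1, lon1, lat2, lon2, xp=np):
    """Initial bearing in degrees (0 = north, 90 = east), element-wise."""
    phi1 = xp.radians(lat1)
    phi2 = xp.radians(lat2)
    dlam = xp.radians(lon2 - lon1)
    y = xp.sin(dlam) * xp.cos(phi2)
    x = xp.cos(phi1) * xp.sin(phi2) - xp.sin(phi1) * xp.cos(phi2) * xp.cos(dlam)
    # arctan2 returns (-180, 180]; shift into [0, 360).
    return (xp.degrees(xp.arctan2(y, x)) + 360.0) % 360.0

site_lat = np.zeros(3)
site_lon = np.zeros(3)
target_lat = np.array([1.0, 0.0, -1.0])  # north, east, south of the site
target_lon = np.array([0.0, 1.0, 0.0])
bearings = bearing_deg(site_lat, site_lon, target_lat, target_lon)
```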

Important Rules

Always get xp at function scope, not module scope:

def my_function(self, ...):
    xp = gpu_manager.get_array_module()
    # use xp instead of np

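Convert GPU arrays back to CPU before returning to non-GPU code. Check the array module rather than hasattr(result, 'get'), because dicts also have a .get method and would be misdetected as CuPy arrays:

if xp is not np:  # xp is CuPy
    result = xp.asnumpy(result)  # → numpy array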

Keep np for small/scalar operations — GPU overhead isn't worth it for single values. Only use xp for array operations on 100+ elements.
Don't break the fallback — if CuPy isn't available, get_array_module() returns numpy, so xp.array() etc. work identically.
Test both paths — run with GPU and verify same results as CPU.
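The conversion and parity rules above could be supported by two small helpers. This is a sketch under assumptions: the names to_numpy and results_match and the tolerances are illustrative, and the tolerances in particular may need adjusting to how the GPU path rounds.

```python
import numpy as np

def to_numpy(arr, xp=np):
    """Copy a device array back to the host; pass NumPy input through."""
    if xp is np:
        return np.asarray(arr)
    return xp.asnumpy(arr)  # cupy.asnumpy performs the device-to-host copy

def results_match(cpu_result, gpu_result, rtol=1e-6, atol=1e-6):
    """True if both paths agree within floating-point tolerance."""
    return bool(np.allclose(cpu_result, gpu_result, rtol=rtol, atol=atol))

cpu = np.array([92.44, 98.46])
gpu = to_numpy(np.array([92.44, 98.46 + 1e-9]))  # stands in for a GPU result
same = results_match(cpu, gpu)
```

Exact equality is usually too strict here, since GPU and CPU math libraries can differ in the last few bits.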

Testing

After changes:

# Rebuild
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --noconfirm

# Run
.\dist\rfcp-server\rfcp-server.exe

# Test calculation via frontend — watch Task Manager GPU utilization
# Should see GPU Compute spike during coverage calculation
# Time should be significantly faster than 10s for 1254 points

Compare before/after:

Current (CPU): ~10s for 1254 points, 5km radius
Expected (GPU): 1-3s for same calculation
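When measuring these numbers in code rather than by eye, note that CuPy launches kernels asynchronously, so the timer should stop only after the device has finished. A minimal timing sketch, with the hypothetical helper name time_call:

```python
import time
import numpy as np

def time_call(fn, *args, xp=np, repeats=3):
    """Return (best wall time in seconds, last result) over several runs."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        if xp is not np:
            # Wait for queued GPU work before reading the clock.
            xp.cuda.Stream.null.synchronize()
        best = min(best, time.perf_counter() - start)
    return best, result

elapsed, value = time_call(np.sum, np.arange(1254))
```

The first GPU run typically includes kernel compilation, which is why the sketch keeps the best of several repeats.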

Also test GPU diagnostics:

curl http://localhost:8888/api/gpu/diagnostics

What NOT to Change

Don't modify gpu_backend.py — it's working correctly
Don't change the API endpoints or response format
Don't remove the NumPy import — keep it for non-array operations
Don't change propagation model math — only the array operations
Don't change _filter_buildings_to_bbox or OSM functions — they use lists not arrays

Success Criteria

Coverage calculation uses GPU (visible in Task Manager)
Calculation time reduced for 1000+ point grids
CPU fallback still works (test by setting active_backend to cpu via API)
Same coverage results (heatmap should look identical)
No regression in tiled processing mode
