# RFCP — Iteration 3.6.0: Production GPU Build
## Overview
Enable GPU acceleration in the production PyInstaller build. The production build
currently runs CPU-only (NumPy) because CuPy is not bundled into rfcp-server.exe.
**Goal:** User with NVIDIA GPU installs RFCP → GPU detected automatically →
coverage calculations use CUDA acceleration. No manual pip install required.
**Context from diagnostics screenshot:**
```json
{
  "python_executable": "C:\\Users\\Administrator\\AppData\\Local\\Programs\\RFCP\\resources\\backend\\rfcp-server.exe",
  "platform": "Windows-10-10.0.26288-SP0",
  "is_wsl": false,
  "numpy": { "version": "1.26.4" },
  "cuda": {
    "error": "CuPy not installed",
    "install_hint": "pip install cupy-cuda12x"
  }
}
```
**Architecture:** Production uses a PyInstaller-bundled rfcp-server.exe (self-contained).
CuPy is not included, so GPU acceleration is unavailable to end users.
---
## Strategy: Two-Tier Build
Instead of one massive binary, produce two builds:
```
RFCP-Setup-{version}.exe (~150 MB) — CPU-only, works everywhere
RFCP-Setup-{version}-GPU.exe (~700 MB) — includes CuPy + CUDA runtime
```
**Why not dynamic loading?**
PyInstaller bundles everything at build time. CuPy can't be pip-installed
into a frozen exe at runtime. Options are:
1. **Bundle CuPy in PyInstaller** ← cleanest, what we'll do
2. Side-load CuPy DLLs (fragile, version-sensitive)
3. Hybrid: unfrozen Python + CuPy installed separately (defeats purpose of exe)
---
## Task 1: PyInstaller Spec with CuPy (Priority 1 — 30 min)
### File: `installer/rfcp-server-gpu.spec`
Create a separate .spec file that includes CuPy:
```python
# rfcp-server-gpu.spec — GPU-enabled build
import os
import sys
from PyInstaller.utils.hooks import collect_all, collect_dynamic_libs

backend_path = os.path.abspath(os.path.join(os.path.dirname(SPEC), '..', 'backend'))

# Collect CuPy and its CUDA dependencies
cupy_datas, cupy_binaries, cupy_hiddenimports = collect_all('cupy')
# Also collect cupy_backends
cupyb_datas, cupyb_binaries, cupyb_hiddenimports = collect_all('cupy_backends')
# CUDA runtime libraries that CuPy needs
cuda_binaries = collect_dynamic_libs('cupy')

a = Analysis(
    [os.path.join(backend_path, 'run_server.py')],
    pathex=[backend_path],
    binaries=cupy_binaries + cupyb_binaries + cuda_binaries,
    datas=[
        (os.path.join(backend_path, 'data', 'terrain'), 'data/terrain'),
    ] + cupy_datas + cupyb_datas,
    hiddenimports=[
        # Existing imports from rfcp-server.spec
        'uvicorn.logging',
        'uvicorn.loops',
        'uvicorn.loops.auto',
        'uvicorn.protocols',
        'uvicorn.protocols.http',
        'uvicorn.protocols.http.auto',
        'uvicorn.protocols.websockets',
        'uvicorn.protocols.websockets.auto',
        'uvicorn.lifespan',
        'uvicorn.lifespan.on',
        'motor',
        'pymongo',
        'numpy',
        'scipy',
        'shapely',
        'shapely.geometry',
        'shapely.ops',
        # CuPy-specific
        'cupy',
        'cupy.cuda',
        'cupy.cuda.runtime',
        'cupy.cuda.driver',
        'cupy.cuda.memory',
        'cupy.cuda.stream',
        'cupy._core',
        'cupy._core.core',
        'cupy._core._routines_math',
        'cupy.fft',
        'cupy.linalg',
        'fastrlock',
    ] + cupy_hiddenimports + cupyb_hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
)

pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.datas,
    [],
    name='rfcp-server',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=False,  # Don't compress CUDA libs — they need fast loading
    console=True,
    icon=os.path.join(os.path.dirname(SPEC), 'rfcp.ico'),
)
```
### Key Points:
- `collect_all('cupy')` grabs all CuPy submodules + CUDA DLLs
- `fastrlock` is a CuPy dependency (must be in hiddenimports)
- `upx=False` — don't compress CUDA binaries (breaks them)
- One-file mode (`a.binaries + a.datas` in EXE) for single exe
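Before committing to a long build, it can be worth sanity-checking what `collect_all()` actually gathered. A minimal sketch: `missing_cuda_libs` is a hypothetical helper, and the DLL name patterns are illustrative, not an exhaustive list of the CUDA libraries CuPy needs.

```python
# Hypothetical pre-build sanity check. PyInstaller's collect_all() returns
# (datas, binaries, hiddenimports); `binaries` is a list of (src, dest) tuples.
import os

# Illustrative name fragments of CUDA runtime libraries CuPy typically loads
EXPECTED_PATTERNS = ("cudart", "nvrtc", "cublas", "cufft")

def missing_cuda_libs(binaries, patterns=EXPECTED_PATTERNS):
    """Return the patterns with no matching file name in a collect_all() binaries list."""
    names = [os.path.basename(src).lower() for src, _dest in binaries]
    return [p for p in patterns if not any(p in n for n in names)]
```

Running this over `cupy_binaries + cupyb_binaries + cuda_binaries` from the spec and investigating anything it reports as missing is cheaper than discovering a missing DLL on a clean test machine.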
---
## Task 2: Build Script for GPU Variant (Priority 1 — 15 min)
### File: `installer/build-gpu.bat` (Windows)
```batch
@echo off
echo ========================================
echo RFCP GPU Build — rfcp-server-gpu.exe
echo ========================================

REM Ensure CuPy is installed in build environment
echo Checking CuPy installation...
python -c "import cupy; print(f'CuPy {cupy.__version__} with CUDA {cupy.cuda.runtime.runtimeGetVersion()}')"
if errorlevel 1 (
    echo ERROR: CuPy not installed. Run: pip install cupy-cuda12x
    exit /b 1
)

REM Build with GPU spec
echo Building rfcp-server with GPU support...
cd /d %~dp0\..\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm

echo.
echo Build complete! Output: dist\rfcp-server.exe
echo Size:
dir dist\rfcp-server.exe

REM Optional: copy to Electron resources
if exist "..\desktop\resources" (
    copy /y dist\rfcp-server.exe ..\desktop\resources\rfcp-server.exe
    echo Copied to desktop\resources\
)
pause
```
### File: `installer/build-gpu.sh` (WSL/Linux)
```bash
#!/bin/bash
set -e

echo "========================================"
echo " RFCP GPU Build — rfcp-server (GPU)"
echo "========================================"

# Check CuPy
python3 -c "import cupy; print(f'CuPy {cupy.__version__}')" 2>/dev/null || {
    echo "ERROR: CuPy not installed. Run: pip install cupy-cuda12x"
    exit 1
}

cd "$(dirname "$0")/../backend"
pyinstaller ../installer/rfcp-server-gpu.spec --clean --noconfirm

echo ""
echo "Build complete!"
ls -lh dist/rfcp-server*
```
---
## Task 3: GPU Backend — Graceful CuPy Detection (Priority 1 — 15 min)
### File: `backend/app/services/gpu_backend.py`
The existing gpu_backend.py should already handle CuPy absence gracefully.
Verify and fix if needed:
```python
# gpu_backend.py — must work in BOTH CPU and GPU builds
import numpy as np

# Try importing CuPy — this is the key detection
_cupy_available = False
_gpu_device_name = None
_gpu_memory_mb = 0

try:
    import cupy as cp

    # Verify we can actually use it (not just import)
    device = cp.cuda.Device(0)
    _gpu_device_name = device.attributes.get('name', f'CUDA Device {device.id}')
    # Try to get the name via the runtime
    try:
        props = cp.cuda.runtime.getDeviceProperties(0)
        _gpu_device_name = props.get('name', _gpu_device_name)
        if isinstance(_gpu_device_name, bytes):
            _gpu_device_name = _gpu_device_name.decode('utf-8').strip('\x00')
    except Exception:
        pass
    _gpu_memory_mb = device.mem_info[1] // (1024 * 1024)
    _cupy_available = True
except ImportError:
    cp = None  # CuPy not installed (CPU build)
except Exception as e:
    cp = None  # CuPy installed but CUDA not available
    print(f"[GPU] CuPy found but CUDA unavailable: {e}")


def is_gpu_available() -> bool:
    return _cupy_available


def get_gpu_info() -> dict:
    if _cupy_available:
        return {
            "available": True,
            "backend": "CuPy (CUDA)",
            "device": _gpu_device_name,
            "memory_mb": _gpu_memory_mb,
        }
    return {
        "available": False,
        "backend": "NumPy (CPU)",
        "device": "CPU",
        "memory_mb": 0,
    }


def get_array_module():
    """Return cupy if available, otherwise numpy."""
    if _cupy_available:
        return cp
    return np
```
### Usage in coverage_service.py:
```python
from app.services.gpu_backend import get_array_module, is_gpu_available

xp = get_array_module()  # cupy or numpy — same API

# All calculations use xp instead of np:
distances = xp.sqrt(dx**2 + dy**2)
path_loss = 20 * xp.log10(distances) + 20 * xp.log10(freq_mhz) - 27.55

# If using cupy, results need to come back to CPU for JSON serialization:
if is_gpu_available():
    results = xp.asnumpy(path_loss)
else:
    results = path_loss
```
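The GPU-to-CPU hand-off can be wrapped in a small helper so call sites don't branch. A sketch (`to_cpu` is a hypothetical helper name, not an existing RFCP function):

```python
import numpy as np

try:
    import cupy as cp  # absent in the CPU build
except ImportError:
    cp = None

def to_cpu(arr):
    """Return a NumPy array regardless of which backend produced `arr`."""
    if cp is not None and isinstance(arr, cp.ndarray):
        return cp.asnumpy(arr)  # device-to-host copy
    return np.asarray(arr)
```

With this, `results = to_cpu(path_loss)` works identically in both builds, and new call sites can't forget the `is_gpu_available()` check.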
---
## Task 4: GPU Status in Frontend Header (Priority 2 — 10 min)
### Update GPUIndicator.tsx
When GPU is detected, the badge should clearly show it:
```
CPU build: [⚙ CPU] (gray badge)
GPU detected: [⚡ RTX 4060] (green badge)
```
The existing GPUIndicator already does this. Just verify:
1. Badge color changes from gray → green when GPU available
2. Dropdown shows "Active: GPU (CUDA)" not just "CPU (NumPy)"
3. No install hints shown when CuPy IS available
---
## Task 5: Build Environment Setup (Priority 1 — Manual by Олег)
### Prerequisites for GPU build:
```powershell
# 1. Install CuPy in Windows Python (NOT WSL)
pip install cupy-cuda12x
# 2. Verify CuPy works
python -c "import cupy; print(cupy.cuda.runtime.runtimeGetVersion())"
# Should print: 12000 or similar
# 3. Install PyInstaller if not present
pip install pyinstaller
# 4. Verify fastrlock (CuPy dependency)
pip install fastrlock
```
### Build commands:
```powershell
# CPU-only build (existing)
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server.spec --clean --noconfirm
# GPU build (new)
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm
```
### Expected output sizes:
```
rfcp-server.exe (CPU): ~80 MB
rfcp-server.exe (GPU): ~600-800 MB (CuPy bundles CUDA runtime libs)
```
---
## Task 6: Electron — Detect Build Variant (Priority 2 — 10 min)
### File: `desktop/main.js` or `desktop/src/main.ts`
Add version detection so the UI knows which build it is running:
```javascript
// After the backend starts, check GPU status
async function checkBackendCapabilities() {
  try {
    const response = await fetch('http://127.0.0.1:8090/api/gpu/status');
    const data = await response.json();
    // Send to renderer
    mainWindow.webContents.send('gpu-status', data);
    if (data.available) {
      console.log(`[RFCP] GPU: ${data.device} (${data.memory_mb} MB)`);
    } else {
      console.log('[RFCP] Running in CPU mode');
    }
  } catch (e) {
    console.log('[RFCP] Backend not ready for GPU check');
  }
}
}
```
---
## Task 7: About / Version Info (Priority 3 — 5 min)
### Add build info to `/api/health` response:
```python
@app.get("/api/health")
async def health():
gpu_info = get_gpu_info()
return {
"status": "ok",
"version": "3.6.0",
"build": "gpu" if gpu_info["available"] else "cpu",
"gpu": gpu_info,
"python": sys.version,
"platform": platform.platform(),
}
```
---
## Build & Test Procedure
### Step 1: Setup Build Environment
```powershell
# Windows PowerShell (NOT WSL)
cd D:\root\rfcp
# Verify Python environment
python --version # Should be 3.11.x
pip list | findstr cupy # Should show cupy-cuda12x
# If CuPy not installed:
pip install cupy-cuda12x fastrlock
```
### Step 2: Build GPU Variant
```powershell
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm
```
### Step 3: Test Standalone
```powershell
# Run the built exe directly
.\dist\rfcp-server.exe
# In another terminal:
curl http://localhost:8090/api/health
curl http://localhost:8090/api/gpu/status
curl http://localhost:8090/api/gpu/diagnostics
```
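The curl checks can be scripted so the smoke test fails fast if the exe never comes up. A stdlib-only sketch, assuming the port 8090 default used above (`wait_for_health` is a hypothetical helper, not part of RFCP):

```python
import json
import time
import urllib.error
import urllib.request

def wait_for_health(url="http://127.0.0.1:8090/api/health", timeout=30.0, interval=1.0):
    """Poll the health endpoint until the server answers or the timeout expires.

    Returns the parsed JSON body on success, None on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return json.load(resp)
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # server not up yet; retry
    return None
```

After launching `dist\rfcp-server.exe`, `wait_for_health()` returning a dict with `"build": "gpu"` confirms both startup and GPU detection in one step.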
### Step 4: Verify GPU Detection
Expected `/api/gpu/status` response:
```json
{
  "available": true,
  "backend": "CuPy (CUDA)",
  "device": "NVIDIA GeForce RTX 4060 Laptop GPU",
  "memory_mb": 8188
}
```
### Step 5: Run Coverage Calculation
- Place a site on map
- Calculate coverage (10km, 200m resolution)
- Check logs for: `[GPU] Using CUDA: RTX 4060 (8188 MB)`
- Compare performance: should be 5-10x faster than CPU
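A rough way to quantify the expected speedup is a micro-benchmark of the same path-loss formula on both backends. A sketch, with the caveat that the 5-10x figure is an expectation to be measured, not a guarantee, and that GPU timing needs a sync because CuPy kernels run asynchronously:

```python
import time
import numpy as np

try:
    import cupy as cp  # absent in the CPU build
except ImportError:
    cp = None

def path_loss_db(xp, n=1_000_000, freq_mhz=2400.0):
    """Free-space path loss in dB over n distances (same formula as coverage_service)."""
    d = xp.linspace(1.0, 10_000.0, n)  # distances in metres
    return 20 * xp.log10(d) + 20 * xp.log10(freq_mhz) - 27.55

def benchmark(xp):
    """Time one path-loss evaluation with the given array module (np or cp)."""
    t0 = time.perf_counter()
    out = path_loss_db(xp)
    if cp is not None and xp is cp:
        cp.cuda.Stream.null.synchronize()  # wait for kernels before stopping the clock
    return out, time.perf_counter() - t0
```

Running `benchmark(np)` and, on the GPU build, `benchmark(cp)` gives a comparable pair of timings; the first CuPy call also pays one-time kernel compilation cost, so a warm-up run is worth adding.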
### Step 6: Full Electron Build
```powershell
# Copy GPU server to Electron resources
copy backend\dist\rfcp-server.exe desktop\resources\
# Build Electron installer
cd installer
.\build-win.sh # or equivalent Windows script
```
---
## Risk Assessment
### Size Concern
CuPy bundles CUDA runtime (~500MB). Total GPU installer ~700-800MB.
**Mitigation:** This is acceptable for a professional RF planning tool.
AutoCAD is 7GB. QGIS is 1.5GB. Atoll is 3GB+.
### CUDA Version Compatibility
CuPy-cuda12x requires CUDA 12.x compatible driver.
RTX 4060 with Driver 581.42 → CUDA 13.0 → backward compatible ✅
**Mitigation:** gpu_backend.py already falls back to NumPy gracefully.
### PyInstaller + CuPy Issues
Known issues:
- CuPy uses many .so/.dll files that PyInstaller might miss
- `collect_all('cupy')` should catch them, but test thoroughly
- If missing DLLs → add them manually to `binaries` list
**Mitigation:** Test the standalone exe on a clean machine (no Python installed).
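When a DLL does go missing, the first step is to see which shared libraries the installed CuPy wheel actually ships. A sketch that walks the package tree with only the standard library (the CuPy wheel layout varies by version, so treat the output as a starting point for manual `binaries` entries, not a definitive list):

```python
import importlib.util
import os

def find_package_libs(package="cupy", exts=(".dll", ".so", ".dylib")):
    """List shared libraries shipped inside an installed package's directory tree."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package
    found = []
    for root in spec.submodule_search_locations:
        for dirpath, _dirs, files in os.walk(root):
            found += [os.path.join(dirpath, f) for f in files if f.lower().endswith(exts)]
    return sorted(found)
```

Diffing this list against the files that actually landed in `dist\` shows exactly which entries to append to the spec's `binaries` list.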
### Antivirus False Positives
Larger exe = more AV suspicion. PyInstaller exes already trigger some AV.
**Mitigation:** Code-sign the exe (future task), submit to AV vendors for whitelisting.
---
## Success Criteria
- [ ] `rfcp-server-gpu.spec` created and builds successfully
- [ ] Built exe detects RTX 4060 on startup
- [ ] `/api/gpu/status` returns `"available": true`
- [ ] Coverage calculation uses CuPy (check logs)
- [ ] GPU badge shows "⚡ RTX 4060" (green) in header
- [ ] Fallback to NumPy works if CUDA unavailable
- [ ] CPU-only spec (`rfcp-server.spec`) still builds and works
- [ ] Build time < 10 minutes
- [ ] GPU exe size < 1 GB
---
## Commit Message
```
feat(build): add GPU-enabled PyInstaller build with CuPy + CUDA
- New rfcp-server-gpu.spec with CuPy/CUDA collection
- Build scripts: build-gpu.bat, build-gpu.sh
- Graceful GPU detection in gpu_backend.py
- Two-tier build: CPU (~80MB) and GPU (~700MB) variants
- Auto-detection: RTX 4060 → CuPy acceleration
- Fallback: no CUDA → NumPy (CPU mode)
Iteration 3.6.0 — Production GPU Build
```
---
## Files Summary
### New Files:
| File | Purpose |
|------|---------|
| `installer/rfcp-server-gpu.spec` | PyInstaller config with CuPy |
| `installer/build-gpu.bat` | Windows GPU build script |
| `installer/build-gpu.sh` | Linux/WSL GPU build script |
### Modified Files:
| File | Changes |
|------|---------|
| `backend/app/services/gpu_backend.py` | Verify graceful detection |
| `backend/app/main.py` | Health endpoint with build info |
| `desktop/main.js` or `main.ts` | GPU status check after backend start |
| `frontend/src/components/ui/GPUIndicator.tsx` | Verify badge shows GPU |
### No Changes Needed:
| File | Reason |
|------|--------|
| `installer/rfcp-server.spec` | CPU build stays as-is |
| `backend/app/services/coverage_service.py` | Already uses get_array_module() |
| `installer/build-win.sh` | Existing CPU build unchanged |
---
## Timeline
| Phase | Task | Time |
|-------|------|------|
| **P1** | Create rfcp-server-gpu.spec | 30 min |
| **P1** | Build scripts | 15 min |
| **P1** | Verify gpu_backend.py | 15 min |
| **P2** | Frontend badge verification | 10 min |
| **P2** | Electron GPU status | 10 min |
| **P3** | Health endpoint update | 5 min |
| **Test** | Build + test standalone | 20 min |
| **Test** | Full Electron build | 15 min |
| | **Total** | **~2 hours** |
**Claude Code estimated time: 10-15 min** (spec + scripts + backend changes)
**Manual testing by Олег: 30-45 min** (building + verifying)