rfcp/RFCP-Iteration-3.6.0-Production-GPU-Build.md

RFCP — Iteration 3.6.0: Production GPU Build

Overview

Enable GPU acceleration in the production PyInstaller build. Production currently runs CPU-only (NumPy) because CuPy is not bundled into rfcp-server.exe.

Goal: User with NVIDIA GPU installs RFCP → GPU detected automatically → coverage calculations use CUDA acceleration. No manual pip install required.

Context from diagnostics screenshot:

{
  "python_executable": "C:\\Users\\Administrator\\AppData\\Local\\Programs\\RFCP\\resources\\backend\\rfcp-server.exe",
  "platform": "Windows-10-10.0.26288-SP0",
  "is_wsl": false,
  "numpy": { "version": "1.26.4" },
  "cuda": {
    "error": "CuPy not installed",
    "install_hint": "pip install cupy-cuda12x"
  }
}

Architecture: Production uses PyInstaller-bundled rfcp-server.exe (self-contained). CuPy not included → GPU not available for end users.


Strategy: Two-Tier Build

Instead of one massive binary, produce two builds:

RFCP-Setup-{version}.exe          (~150 MB) — CPU-only, works everywhere
RFCP-Setup-{version}-GPU.exe      (~700 MB) — includes CuPy + CUDA runtime

Why not dynamic loading? PyInstaller bundles everything at build time. CuPy can't be pip-installed into a frozen exe at runtime. Options are:

  1. Bundle CuPy in PyInstaller ← cleanest, what we'll do
  2. Side-load CuPy DLLs (fragile, version-sensitive)
  3. Hybrid: unfrozen Python + CuPy installed separately (defeats purpose of exe)

Task 1: PyInstaller Spec with CuPy (Priority 1 — 30 min)

File: installer/rfcp-server-gpu.spec

Create a separate .spec file that includes CuPy:

# rfcp-server-gpu.spec — GPU-enabled build
import os
import sys
from PyInstaller.utils.hooks import collect_all, collect_dynamic_libs

backend_path = os.path.abspath(os.path.join(os.path.dirname(SPEC), '..', 'backend'))

# Collect CuPy and its CUDA dependencies
cupy_datas, cupy_binaries, cupy_hiddenimports = collect_all('cupy')
# Also collect cupy_backends
cupyb_datas, cupyb_binaries, cupyb_hiddenimports = collect_all('cupy_backends')

# CUDA runtime libraries that CuPy needs
cuda_binaries = collect_dynamic_libs('cupy')

a = Analysis(
    [os.path.join(backend_path, 'run_server.py')],
    pathex=[backend_path],
    binaries=cupy_binaries + cupyb_binaries + cuda_binaries,
    datas=[
        (os.path.join(backend_path, 'data', 'terrain'), 'data/terrain'),
    ] + cupy_datas + cupyb_datas,
    hiddenimports=[
        # Existing imports from rfcp-server.spec
        'uvicorn.logging',
        'uvicorn.loops',
        'uvicorn.loops.auto',
        'uvicorn.protocols',
        'uvicorn.protocols.http',
        'uvicorn.protocols.http.auto',
        'uvicorn.protocols.websockets',
        'uvicorn.protocols.websockets.auto',
        'uvicorn.lifespan',
        'uvicorn.lifespan.on',
        'motor',
        'pymongo',
        'numpy',
        'scipy',
        'shapely',
        'shapely.geometry',
        'shapely.ops',
        # CuPy-specific
        'cupy',
        'cupy.cuda',
        'cupy.cuda.runtime',
        'cupy.cuda.driver',
        'cupy.cuda.memory',
        'cupy.cuda.stream',
        'cupy._core',
        'cupy._core.core',
        'cupy._core._routines_math',
        'cupy.fft',
        'cupy.linalg',
        'fastrlock',
    ] + cupy_hiddenimports + cupyb_hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
)

pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.datas,
    [],
    name='rfcp-server',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=False,  # Don't compress CUDA libs — they need fast loading
    console=True,
    icon=os.path.join(os.path.dirname(SPEC), 'rfcp.ico'),
)

Key Points:

  • collect_all('cupy') grabs all CuPy submodules + CUDA DLLs
  • fastrlock is a CuPy dependency (must be in hiddenimports)
  • upx=False — don't compress CUDA binaries (breaks them)
  • One-file mode (a.binaries + a.datas in EXE) for single exe

Task 2: Build Script for GPU Variant (Priority 1 — 15 min)

File: installer/build-gpu.bat (Windows)

@echo off
echo ========================================
echo  RFCP GPU Build — rfcp-server-gpu.exe
echo ========================================

REM Ensure CuPy is installed in build environment
echo Checking CuPy installation...
python -c "import cupy; print(f'CuPy {cupy.__version__} with CUDA {cupy.cuda.runtime.runtimeGetVersion()}')"
if errorlevel 1 (
    echo ERROR: CuPy not installed. Run: pip install cupy-cuda12x
    exit /b 1
)

REM Build with GPU spec
echo Building rfcp-server with GPU support...
cd /d %~dp0\..\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm

echo.
echo Build complete! Output: dist\rfcp-server.exe
echo Size:
dir dist\rfcp-server.exe

REM Optional: copy to Electron resources
if exist "..\desktop\resources" (
    copy /y dist\rfcp-server.exe ..\desktop\resources\rfcp-server.exe
    echo Copied to desktop\resources\
)

pause

File: installer/build-gpu.sh (WSL/Linux)

#!/bin/bash
set -e

echo "========================================"
echo " RFCP GPU Build — rfcp-server (GPU)"
echo "========================================"

# Check CuPy
python3 -c "import cupy; print(f'CuPy {cupy.__version__}')" 2>/dev/null || {
    echo "ERROR: CuPy not installed. Run: pip install cupy-cuda12x"
    exit 1
}

cd "$(dirname "$0")/../backend"
pyinstaller ../installer/rfcp-server-gpu.spec --clean --noconfirm

echo ""
echo "Build complete!"
ls -lh dist/rfcp-server*

Task 3: GPU Backend — Graceful CuPy Detection (Priority 1 — 15 min)

File: backend/app/services/gpu_backend.py

The existing gpu_backend.py should already handle CuPy absence gracefully. Verify and fix if needed:

# gpu_backend.py — must work in BOTH CPU and GPU builds

import numpy as np

# Try importing CuPy — this is the key detection
_cupy_available = False
_gpu_device_name = None
_gpu_memory_mb = 0

try:
    import cupy as cp
    # Verify we can actually use it (not just import)
    device = cp.cuda.Device(0)
    _gpu_device_name = f'CUDA Device {device.id}'
    # Device.attributes only holds numeric attributes, so fetch the
    # human-readable name via the runtime API
    try:
        props = cp.cuda.runtime.getDeviceProperties(0)
        name = props.get('name', _gpu_device_name)
        if isinstance(name, bytes):
            name = name.decode('utf-8').strip('\x00')
        _gpu_device_name = name
    except Exception:
        pass
    _gpu_memory_mb = device.mem_info[1] // (1024 * 1024)
    _cupy_available = True
except ImportError:
    cp = None  # CuPy not installed (CPU build)
except Exception as e:
    cp = None  # CuPy installed but CUDA not available
    print(f"[GPU] CuPy found but CUDA unavailable: {e}")


def is_gpu_available() -> bool:
    return _cupy_available

def get_gpu_info() -> dict:
    if _cupy_available:
        return {
            "available": True,
            "backend": "CuPy (CUDA)",
            "device": _gpu_device_name,
            "memory_mb": _gpu_memory_mb,
        }
    return {
        "available": False,
        "backend": "NumPy (CPU)",
        "device": "CPU",
        "memory_mb": 0,
    }

def get_array_module():
    """Return cupy if available, otherwise numpy."""
    if _cupy_available:
        return cp
    return np

Usage in coverage_service.py:

from app.services.gpu_backend import get_array_module, is_gpu_available

xp = get_array_module()  # cupy or numpy — same API

# All calculations use xp instead of np:
distances = xp.sqrt(dx**2 + dy**2)
path_loss = 20 * xp.log10(distances) + 20 * xp.log10(freq_mhz) - 27.55

# If using cupy, results need to come back to CPU for JSON serialization:
if is_gpu_available():
    results = xp.asnumpy(path_loss)
else:
    results = path_loss

Task 4: GPU Status in Frontend Header (Priority 2 — 10 min)

Update GPUIndicator.tsx

When GPU is detected, the badge should clearly show it:

CPU build:     [⚙ CPU]          (gray badge)
GPU detected:  [⚡ RTX 4060]     (green badge)

The existing GPUIndicator already does this. Just verify:

  1. Badge color changes from gray → green when GPU available
  2. Dropdown shows "Active: GPU (CUDA)" not just "CPU (NumPy)"
  3. No install hints shown when CuPy IS available
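As a reference for the verification above, a hypothetical Python mirror of the badge logic (the real implementation lives in GPUIndicator.tsx; `badge_label` is an illustrative name, not an existing function):

```python
def badge_label(gpu_info: dict) -> str:
    """Map an /api/gpu/status payload to the header badge text."""
    if gpu_info.get("available"):
        # Green badge with the short device name, e.g. "RTX 4060"
        return f"⚡ {gpu_info.get('device', 'GPU')}"
    return "⚙ CPU"  # gray badge

print(badge_label({"available": True, "device": "RTX 4060"}))
print(badge_label({"available": False}))
```

The same decision drives points 1-3: badge color, dropdown text, and install hints all key off the single `available` flag.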

Task 5: Build Environment Setup (Priority 1 — Manual by Олег)

Prerequisites for GPU build:

# 1. Install CuPy in Windows Python (NOT WSL)
pip install cupy-cuda12x

# 2. Verify CuPy works
python -c "import cupy; print(cupy.cuda.runtime.runtimeGetVersion())"
# Should print: 12000 or similar

# 3. Install PyInstaller if not present
pip install pyinstaller

# 4. Verify fastrlock (CuPy dependency)
pip install fastrlock

Build commands:

# CPU-only build (existing)
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server.spec --clean --noconfirm

# GPU build (new)
cd D:\root\rfcp\backend  
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm

Expected output sizes:

rfcp-server.exe (CPU):  ~80 MB
rfcp-server.exe (GPU):  ~600-800 MB  (CuPy bundles CUDA runtime libs)

Task 6: Electron — Detect Build Variant (Priority 2 — 10 min)

File: desktop/main.js or desktop/src/main.ts

Add version detection so UI knows which build it's running:

// After backend starts, check GPU status
async function checkBackendCapabilities() {
  try {
    const response = await fetch('http://127.0.0.1:8090/api/gpu/status');
    const data = await response.json();
    
    // Send to renderer
    mainWindow.webContents.send('gpu-status', data);
    
    if (data.available) {
      console.log(`[RFCP] GPU: ${data.device} (${data.memory_mb} MB)`);
    } else {
      console.log('[RFCP] Running in CPU mode');
    }
  } catch (e) {
    console.log('[RFCP] Backend not ready for GPU check');
  }
}

Task 7: About / Version Info (Priority 3 — 5 min)

Add build info to /api/health response:

@app.get("/api/health")
async def health():
    gpu_info = get_gpu_info()
    return {
        "status": "ok",
        "version": "3.6.0",
        "build": "gpu" if gpu_info["available"] else "cpu",
        "gpu": gpu_info,
        "python": sys.version,
        "platform": platform.platform(),
    }

Build & Test Procedure

Step 1: Setup Build Environment

# Windows PowerShell (NOT WSL)
cd D:\root\rfcp

# Verify Python environment  
python --version       # Should be 3.11.x
pip list | findstr cupy  # Should show cupy-cuda12x

# If CuPy not installed:
pip install cupy-cuda12x fastrlock

Step 2: Build GPU Variant

cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm

Step 3: Test Standalone

# Run the built exe directly
.\dist\rfcp-server.exe

# In another terminal:
curl http://localhost:8090/api/health
curl http://localhost:8090/api/gpu/status
curl http://localhost:8090/api/gpu/diagnostics

Step 4: Verify GPU Detection

Expected /api/gpu/status response:

{
  "available": true,
  "backend": "CuPy (CUDA)",
  "device": "NVIDIA GeForce RTX 4060 Laptop GPU",
  "memory_mb": 8188
}

Step 5: Run Coverage Calculation

  • Place a site on map
  • Calculate coverage (10km, 200m resolution)
  • Check logs for: [GPU] Using CUDA: RTX 4060 (8188 MB)
  • Compare performance: should be 5-10x faster than CPU
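A rough timing sketch for the performance comparison, assuming NumPy is installed and CuPy is optional; the grid size and `path_loss_grid` helper are illustrative, not part of coverage_service.py:

```python
import time

import numpy as np

try:
    import cupy as cp
    xp = cp
except ImportError:
    cp = None
    xp = np

def path_loss_grid(xp, n=512, freq_mhz=900.0):
    """Free-space path loss over an n x n grid of distances in meters."""
    dx, dy = xp.meshgrid(xp.arange(1, n + 1, dtype=xp.float64),
                         xp.arange(1, n + 1, dtype=xp.float64))
    distances = xp.sqrt(dx**2 + dy**2)
    return 20 * xp.log10(distances) + 20 * xp.log10(freq_mhz) - 27.55

t0 = time.perf_counter()
result = path_loss_grid(xp)
if cp is not None:
    cp.cuda.Stream.null.synchronize()  # wait for the GPU kernel to finish
    result = cp.asnumpy(result)        # bring results back to the CPU
elapsed = time.perf_counter() - t0
print(f"{'GPU' if cp else 'CPU'} grid computed in {elapsed * 1000:.1f} ms")
```

Run it once in each build: the GPU variant should report a markedly lower time once grids reach production sizes (small grids can hide the speedup behind kernel-launch overhead).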

Step 6: Full Electron Build

# Copy GPU server to Electron resources
copy backend\dist\rfcp-server.exe desktop\resources\

# Build Electron installer
cd installer
.\build-win.sh  # or equivalent Windows script

Risk Assessment

Size Concern

CuPy bundles CUDA runtime (~500MB). Total GPU installer ~700-800MB. Mitigation: This is acceptable for a professional RF planning tool. AutoCAD is 7GB. QGIS is 1.5GB. Atoll is 3GB+.

CUDA Version Compatibility

cupy-cuda12x requires a CUDA 12.x-compatible driver. An RTX 4060 with driver 581.42 reports CUDA 13.0, which is backward compatible with the bundled 12.x runtime. Mitigation: gpu_backend.py already falls back to NumPy gracefully.
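The compatibility rule can be sketched as follows; the CuPy call is guarded so the snippet also runs on CPU-only machines (`cuda_versions_compatible` is an illustrative helper, not an existing API):

```python
def cuda_versions_compatible(driver_version: int, runtime_version: int) -> bool:
    """CUDA drivers are backward compatible: the driver's supported CUDA
    version (e.g. 13000 for 13.0) must be >= the bundled runtime (e.g. 12000)."""
    return driver_version >= runtime_version

try:
    import cupy as cp
    drv = cp.cuda.runtime.driverGetVersion()   # CUDA version the driver supports
    rt = cp.cuda.runtime.runtimeGetVersion()   # CUDA runtime bundled with CuPy
    print(f"driver CUDA {drv}, runtime CUDA {rt}, "
          f"compatible={cuda_versions_compatible(drv, rt)}")
except ImportError:
    print("CuPy not installed — CPU build")
```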

PyInstaller + CuPy Issues

Known issues:

  • CuPy uses many .so/.dll files that PyInstaller might miss
  • collect_all('cupy') should catch them, but test thoroughly
  • If missing DLLs → add them manually to binaries list

Mitigation: Test the standalone exe on a clean machine (no Python installed).
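If DLLs do turn out to be missing, a hypothetical helper for rfcp-server-gpu.spec could sweep CuPy's bundled library directory and append everything it finds (`extra_cuda_binaries` and the directory layout are assumptions to adapt, not a known-required fix):

```python
import glob
import os

def extra_cuda_binaries(cupy_lib_dir: str) -> list:
    """Return PyInstaller (source, dest_dir) tuples for every DLL found
    under cupy_lib_dir, for appending to the Analysis binaries list."""
    pattern = os.path.join(cupy_lib_dir, '**', '*.dll')
    return [(dll, '.') for dll in sorted(glob.glob(pattern, recursive=True))]

# In the spec, something like:
#   binaries=cupy_binaries + extra_cuda_binaries(path_to_cupy_lib_dir)
```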

Antivirus False Positives

Larger exe = more AV suspicion. PyInstaller exes already trigger some AV. Mitigation: Code-sign the exe (future task), submit to AV vendors for whitelisting.


Success Criteria

  • rfcp-server-gpu.spec created and builds successfully
  • Built exe detects RTX 4060 on startup
  • /api/gpu/status returns "available": true
  • Coverage calculation uses CuPy (check logs)
  • GPU badge shows "⚡ RTX 4060" (green) in header
  • Fallback to NumPy works if CUDA unavailable
  • CPU-only spec (rfcp-server.spec) still builds and works
  • Build time < 10 minutes
  • GPU exe size < 1 GB

Commit Message

feat(build): add GPU-enabled PyInstaller build with CuPy + CUDA

- New rfcp-server-gpu.spec with CuPy/CUDA collection
- Build scripts: build-gpu.bat, build-gpu.sh
- Graceful GPU detection in gpu_backend.py
- Two-tier build: CPU (~80MB) and GPU (~700MB) variants
- Auto-detection: RTX 4060 → CuPy acceleration
- Fallback: no CUDA → NumPy (CPU mode)

Iteration 3.6.0 — Production GPU Build

Files Summary

New Files:

| File | Purpose |
| --- | --- |
| installer/rfcp-server-gpu.spec | PyInstaller config with CuPy |
| installer/build-gpu.bat | Windows GPU build script |
| installer/build-gpu.sh | Linux/WSL GPU build script |

Modified Files:

| File | Changes |
| --- | --- |
| backend/app/services/gpu_backend.py | Verify graceful detection |
| backend/app/main.py | Health endpoint with build info |
| desktop/main.js or main.ts | GPU status check after backend start |
| frontend/src/components/ui/GPUIndicator.tsx | Verify badge shows GPU |

No Changes Needed:

| File | Reason |
| --- | --- |
| installer/rfcp-server.spec | CPU build stays as-is |
| backend/app/services/coverage_service.py | Already uses get_array_module() |
| installer/build-win.sh | Existing CPU build unchanged |

Timeline

| Phase | Task | Time |
| --- | --- | --- |
| P1 | Create rfcp-server-gpu.spec | 30 min |
| P1 | Build scripts | 15 min |
| P1 | Verify gpu_backend.py | 15 min |
| P2 | Frontend badge verification | 10 min |
| P2 | Electron GPU status | 10 min |
| P3 | Health endpoint update | 5 min |
| Test | Build + test standalone | 20 min |
| Test | Full Electron build | 15 min |
| Total | | ~2 hours |

Claude Code estimated time: 10-15 min (spec + scripts + backend changes)
Manual testing by Олег: 30-45 min (building + verifying)