RFCP 3.6.0 — Production GPU Build (Claude Code Task)

Goal

Build rfcp-server.exe (PyInstaller) with CuPy GPU support so production RFCP detects the NVIDIA GPU without manual pip install.

Currently, the production exe shows "CPU (NumPy)" because CuPy is not bundled.

Current Environment (CONFIRMED WORKING)

Windows 10 (10.0.26200)
Python 3.11.8 (C:\Python311)
NVIDIA GeForce RTX 4060 Laptop GPU (8 GB VRAM)
CUDA Toolkit 13.1 (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1)
CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1

Packages:
  cupy-cuda13x  13.6.0   ← NOT cuda12x!
  numpy         1.26.4
  scipy         1.17.0
  fastrlock     0.8.3
  pyinstaller   6.18.0

GPU compute verified:
  python -c "import cupy; a = cupy.array([1,2,3]); print(a.sum())"  → 6 ✅

What We Already Tried (And Why It Failed)

Attempt 1: ONEFILE spec with collect_all('cupy')

  • collect_all('cupy') returns 1882 datas, 0 binaries — the CuPy pip wheel doesn't bundle CUDA DLLs on Windows
  • CUDA DLLs come from two separate sources:
    • nvidia pip packages (14 DLLs in C:\Python311\Lib\site-packages\nvidia\*/bin/)
    • CUDA Toolkit (13 DLLs in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64\)
  • We manually collected these 27 DLLs in the spec
  • Build succeeded (3 GB exe!) but crashed on launch:
    [PYI-10456:ERROR] Failed to extract cufft64_12.dll: decompression resulted in return code -1!
    
  • Root cause: cufft64_12.dll is 297 MB — PyInstaller's zlib compression fails on it in ONEFILE mode
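The manual DLL collection described above can be expressed as a small helper in the spec file. A sketch only — the path comes from this machine's environment, and the helper name is illustrative, not the name used in the actual spec:

```python
import glob
import os

def collect_cuda_dlls(cuda_bin: str) -> list:
    """Return (source, dest) pairs in the format Analysis(binaries=...) expects.
    dest "." places each DLL next to the exe in the dist folder."""
    return [(dll, ".") for dll in glob.glob(os.path.join(cuda_bin, "*.dll"))]

# Toolkit bin directory on this machine (CUDA_PATH + \bin\x64)
CUDA_BIN = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64"
cuda_binaries = collect_cuda_dlls(CUDA_BIN)
```

Globbing the Toolkit bin directory avoids hand-maintaining the list of 13 DLLs as the Toolkit version changes.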

Attempt 2: ONEDIR build — planned, but not yet attempted

Key Insight: Duplicate DLLs from two sources

The nvidia pip packages ship CUDA 12.x DLLs (cublas64_12.dll, etc.), while CUDA Toolkit 13.1 ships CUDA 13.x DLLs (cublas64_13.dll, etc.). cupy-cuda13x needs the 13.x versions; the 12.x DLLs from pip may conflict.
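One way to guarantee only the 13.x versions are bundled is to filter the binaries list before handing it to Analysis. A sketch, assuming the (source, dest) tuple format PyInstaller specs use and the cublas64_12.dll naming pattern described above:

```python
import re

def drop_cuda12_duplicates(binaries: list) -> list:
    """Remove CUDA 12.x DLLs (e.g. cublas64_12.dll) from a PyInstaller
    binaries list, so only the Toolkit's 13.x versions remain.
    `binaries` holds (source_path, dest_dir) tuples."""
    cuda12 = re.compile(r"64_12\.dll$", re.IGNORECASE)
    return [(src, dest) for src, dest in binaries if not cuda12.search(src)]
```

Run this over the combined list (pip-package DLLs plus Toolkit DLLs) as a safety net even if the pip-package directories are skipped entirely.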

What Needs To Happen

  1. Build rfcp-server as ONEDIR (folder with exe + DLLs, not single exe)

    • This avoids the decompression crash with large CUDA DLLs
    • Output: backend/dist/rfcp-server/rfcp-server.exe + all DLLs alongside
  2. Include ONLY the correct CUDA DLLs

    • Prefer CUDA Toolkit 13.1 DLLs (match cupy-cuda13x)
    • The nvidia pip packages have cuda12x DLLs — may cause version conflicts
    • Key DLLs needed: cublas, cusparse, cusolver, curand, cufft, nvrtc, cudart
  3. Exclude bloat — the previous build pulled in tensorflow, grpc, opentelemetry, etc., making it 3 GB. The real size should be ~600-800 MB.

  4. Test the built exe — run it standalone and verify:

    • curl http://localhost:8090/api/health returns "build": "gpu"
    • curl http://localhost:8090/api/gpu/status returns "available": true
    • Or at minimum: the exe starts without errors and CuPy imports successfully
  5. Update Electron integration if needed:

    • Current Electron expects a single rfcp-server.exe file
    • With ONEDIR, it's a folder rfcp-server/rfcp-server.exe
    • File: desktop/main.js or desktop/src/main.ts — look for where it spawns backend
    • The path needs to change from resources/backend/rfcp-server.exe to resources/backend/rfcp-server/rfcp-server.exe
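Steps 1-3 could look roughly like this in rfcp-server-gpu.spec. A sketch only: the DLL list, hiddenimports, and excludes are illustrative, and the working CPU spec at installer/rfcp-server.spec should remain the real base:

```python
# Sketch of an ONEDIR PyInstaller spec: EXE gets exclude_binaries=True and a
# COLLECT step gathers exe + DLLs into dist/rfcp-server/. This sidesteps the
# ONEFILE decompression crash on the 297 MB cufft DLL.
cuda_binaries = [
    # (source DLL, dest dir) pairs from the CUDA Toolkit 13.1 bin directory;
    # "." places each DLL next to the exe. One entry shown; the rest
    # (cusparse, cusolver, curand, cufft, nvrtc, cudart) follow the same shape.
    (r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64\cublas64_13.dll", "."),
]

a = Analysis(
    ["run_server.py"],
    binaries=cuda_binaries,
    hiddenimports=["cupy", "fastrlock"],               # illustrative guesses
    excludes=["tensorflow", "grpc", "opentelemetry"],  # known bloat from the 3 GB build
)
pyz = PYZ(a.pure)
exe = EXE(
    pyz,
    a.scripts,
    exclude_binaries=True,   # ONEDIR: DLLs live alongside the exe, not inside it
    name="rfcp-server",
    console=True,
)
coll = COLLECT(exe, a.binaries, a.datas, name="rfcp-server")
```

The key differences from a ONEFILE spec are `exclude_binaries=True` on EXE and the trailing COLLECT, which is what produces the dist/rfcp-server/ folder.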

File Locations

D:\root\rfcp\
├── backend\
│   ├── run_server.py          ← PyInstaller entry point
│   ├── app\
│   │   ├── main.py            ← FastAPI app
│   │   ├── services\
│   │   │   ├── gpu_backend.py ← GPU detection (CuPy/NumPy fallback)
│   │   │   └── coverage_service.py ← Uses get_array_module()
│   │   └── api\routes\gpu.py  ← /api/gpu/status, /api/gpu/diagnostics
│   ├── dist\                  ← PyInstaller output goes here
│   └── build\                 ← PyInstaller build cache
├── installer\
│   ├── rfcp-server-gpu.spec   ← GPU spec (needs fixing)
│   ├── rfcp-server.spec       ← CPU spec (working, don't touch)
│   ├── rfcp.ico               ← Icon (exists)
│   └── build-gpu.bat          ← Build script
├── desktop\
│   ├── main.js or src/main.ts ← Electron main process
│   └── resources\backend\     ← Where production exe lives
└── frontend\                  ← React frontend (no changes needed)

Existing CPU spec for reference

The working CPU-only spec is at installer/rfcp-server.spec. Use it as the base and ADD CuPy + CUDA on top. Don't reinvent the wheel.

Build Command

cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm

Success Criteria

  • dist/rfcp-server/rfcp-server.exe starts without errors
  • CuPy imports successfully inside the exe (no missing DLL errors)
  • /api/gpu/status returns "available": true, "device": "RTX 4060"
  • Total folder size < 1 GB (ideally 600-800 MB)
  • No tensorflow/grpc/opentelemetry bloat
  • Electron can find and launch the backend (path updated if needed)
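The first two endpoint checks can be scripted instead of eyeballed. A minimal sketch using only the stdlib — the URLs and JSON keys are taken from the success criteria above, and the exe must already be running:

```python
import json
import urllib.request

def verify_endpoint(url: str, key: str, expected) -> bool:
    """Fetch a JSON endpoint and compare one field against an expected value."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp).get(key) == expected

# Usage against the running exe:
#   verify_endpoint("http://localhost:8090/api/health", "build", "gpu")
#   verify_endpoint("http://localhost:8090/api/gpu/status", "available", True)
```

Both calls returning True means the GPU build is live; a False from the second check is the "CPU (NumPy)" symptom this whole task exists to fix.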

Important Notes

  • Do NOT use cupy-cuda12x — we migrated to cupy-cuda13x
  • Do NOT try ONEFILE mode — cufft64_12.dll (297 MB) crashes decompression
  • The nvidia pip packages (nvidia-cublas-cu12, etc.) are still installed but may conflict with CUDA Toolkit 13.1 — prefer Toolkit DLLs
  • collect_all('cupy') gives 0 binaries on Windows — DLLs must be manually specified
  • gpu_backend.py already handles CuPy absence gracefully (falls back to NumPy)
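For reference, the graceful-fallback pattern attributed to gpu_backend.py typically looks like the sketch below. The real module's API may differ; get_array_module is the name coverage_service.py reportedly uses:

```python
# Sketch of a CuPy-with-NumPy-fallback backend. The broad except matters
# here: a bundled CuPy with missing CUDA DLLs can raise more than ImportError.
try:
    import cupy as xp
    GPU_AVAILABLE = True
except Exception:
    import numpy as xp
    GPU_AVAILABLE = False

def get_array_module():
    """Return CuPy when the GPU stack loaded, otherwise NumPy."""
    return xp
```

Because both libraries share the same array API for the common operations, callers like coverage_service.py can stay backend-agnostic.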