RFCP 3.6.0 — Production GPU Build (Claude Code Task)
Goal
Build rfcp-server.exe (PyInstaller) with CuPy GPU support so production RFCP
detects the NVIDIA GPU without manual pip install.
The production exe currently shows "CPU (NumPy)" because CuPy is not bundled.
Current Environment (CONFIRMED WORKING)
Windows 10 (10.0.26200)
Python 3.11.8 (C:\Python311)
NVIDIA GeForce RTX 4060 Laptop GPU (8 GB VRAM)
CUDA Toolkit 13.1 (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1)
CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1
Packages:
cupy-cuda13x 13.6.0 ← NOT cuda12x!
numpy 1.26.4
scipy 1.17.0
fastrlock 0.8.3
pyinstaller 6.18.0
GPU compute verified:
python -c "import cupy; a = cupy.array([1,2,3]); print(a.sum())" → 6 ✅
What We Already Tried (And Why It Failed)
Attempt 1: ONEFILE spec with collect_all('cupy')
- collect_all('cupy') returns 1882 datas, 0 binaries — the CuPy pip wheel doesn't bundle DLLs on Windows
- CUDA DLLs come from two separate sources:
  - nvidia pip packages (14 DLLs in C:\Python311\Lib\site-packages\nvidia\*\bin\)
  - CUDA Toolkit (13 DLLs in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64\)
- We manually collected these 27 DLLs in the spec
- Build succeeded (3 GB exe!) but crashed on launch:
  [PYI-10456:ERROR] Failed to extract cufft64_12.dll: decompression resulted in return code -1!
- Root cause: cufft64_12.dll is 297 MB — PyInstaller's zlib compression fails on it in ONEFILE mode
Attempt 2: ONEDIR mode — planned as the next step, but not built yet
Key Insight: Duplicate DLLs from two sources
- nvidia pip packages ship CUDA 12.x DLLs (cublas64_12.dll etc.)
- CUDA Toolkit 13.1 ships CUDA 13.x DLLs (cublas64_13.dll etc.)
- cupy-cuda13x needs the 13.x versions; the 12.x DLLs from pip may conflict.
What Needs To Happen
- Build rfcp-server as ONEDIR (folder with exe + DLLs, not a single exe)
  - This avoids the decompression crash with large CUDA DLLs
  - Output: backend/dist/rfcp-server/rfcp-server.exe + all DLLs alongside
- Include ONLY the correct CUDA DLLs
  - Prefer CUDA Toolkit 13.1 DLLs (match cupy-cuda13x)
  - The nvidia pip packages have cuda12x DLLs — may cause version conflicts
  - Key DLLs needed: cublas, cusparse, cusolver, curand, cufft, nvrtc, cudart
- Exclude bloat — the previous build pulled in tensorflow, grpc, opentelemetry etc., making it 3 GB. Real size should be ~600-800 MB.
- Test the built exe — run it standalone and verify:
  - curl http://localhost:8090/api/health returns "build": "gpu"
  - curl http://localhost:8090/api/gpu/status returns "available": true
  - Or at minimum: the exe starts without errors and CuPy imports successfully
- Update Electron integration if needed:
  - Current Electron expects a single rfcp-server.exe file
  - With ONEDIR, it's a folder: rfcp-server/rfcp-server.exe
  - File: desktop/main.js or desktop/src/main.ts — look for where it spawns the backend
  - The path needs to change from resources/backend/rfcp-server.exe to resources/backend/rfcp-server/rfcp-server.exe
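The spec-side changes above can be sketched roughly as follows. This is a hedged sketch, not the real installer/rfcp-server-gpu.spec: it assumes PyInstaller 6's spec API (no `a.zipfiles` in COLLECT) and reuses the paths and DLL names listed in this document; start from the working CPU spec and adapt.

```python
# Hedged sketch of rfcp-server-gpu.spec — paths, DLL prefixes, and excludes
# are taken from the notes above; verify against the working CPU spec.
from pathlib import Path
from PyInstaller.utils.hooks import collect_all

CUDA_BIN = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64")

# collect_all('cupy') yields datas + hiddenimports but 0 binaries on Windows,
# so CUDA 13.x DLLs are listed explicitly from the Toolkit — never from the
# cuda12x nvidia pip packages.
cupy_datas, cupy_binaries, cupy_hidden = collect_all("cupy")
WANTED = ("cublas", "cusparse", "cusolver", "curand", "cufft", "nvrtc", "cudart")
cuda_dlls = [(str(p), ".") for p in CUDA_BIN.glob("*.dll")
             if p.name.lower().startswith(WANTED)]

a = Analysis(
    ["run_server.py"],
    binaries=cupy_binaries + cuda_dlls,
    datas=cupy_datas,
    hiddenimports=cupy_hidden,
    excludes=["tensorflow", "grpc", "opentelemetry"],  # keep the 3 GB bloat out
)
pyz = PYZ(a.pure)
# ONEDIR: exclude_binaries=True hands the DLLs to COLLECT instead of packing
# them into the exe, so the 297 MB cufft DLL is never (de)compressed.
exe = EXE(pyz, a.scripts, exclude_binaries=True,
          name="rfcp-server", console=True, icon="rfcp.ico")
coll = COLLECT(exe, a.binaries, a.datas, name="rfcp-server")
```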
File Locations
D:\root\rfcp\
├── backend\
│ ├── run_server.py ← PyInstaller entry point
│ ├── app\
│ │ ├── main.py ← FastAPI app
│ │ ├── services\
│ │ │ ├── gpu_backend.py ← GPU detection (CuPy/NumPy fallback)
│ │ │ └── coverage_service.py ← Uses get_array_module()
│ │ └── api\routes\gpu.py ← /api/gpu/status, /api/gpu/diagnostics
│ ├── dist\ ← PyInstaller output goes here
│ └── build\ ← PyInstaller build cache
├── installer\
│ ├── rfcp-server-gpu.spec ← GPU spec (needs fixing)
│ ├── rfcp-server.spec ← CPU spec (working, don't touch)
│ ├── rfcp.ico ← Icon (exists)
│ └── build-gpu.bat ← Build script
├── desktop\
│ ├── main.js or src/main.ts ← Electron main process
│ └── resources\backend\ ← Where production exe lives
└── frontend\ ← React frontend (no changes needed)
Existing CPU spec for reference
The working CPU-only spec is at installer/rfcp-server.spec. Use it as the base
and ADD CuPy + CUDA on top. Don't reinvent the wheel.
Build Command
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm
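After the build, a small script can automate the standalone check. This is a sketch under assumptions: the port (8090) and the `"available"`/`"device"` fields come from the success criteria below, and the 5-second startup wait is a placeholder — poll in real use.

```python
import json
import subprocess
import time
import urllib.request

def check_gpu_status(payload: dict) -> bool:
    """Return True if a /api/gpu/status payload reports a usable GPU."""
    return bool(payload.get("available")) and "device" in payload

def smoke_test(exe=r"dist\rfcp-server\rfcp-server.exe", port=8090) -> bool:
    """Launch the built exe, hit /api/gpu/status, and report the result."""
    proc = subprocess.Popen([exe])
    try:
        time.sleep(5)  # crude startup wait
        url = f"http://localhost:{port}/api/gpu/status"
        with urllib.request.urlopen(url) as resp:
            return check_gpu_status(json.load(resp))
    finally:
        proc.terminate()
```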
Success Criteria
- dist/rfcp-server/rfcp-server.exe starts without errors
- CuPy imports successfully inside the exe (no missing DLL errors)
- /api/gpu/status returns "available": true, "device": "RTX 4060"
- Total folder size < 1 GB (ideally 600-800 MB)
- No tensorflow/grpc/opentelemetry bloat
- Electron can find and launch the backend (path updated if needed)
Important Notes
- Do NOT use cupy-cuda12x — we migrated to cupy-cuda13x
- Do NOT try ONEFILE mode — cufft64_12.dll (297 MB) crashes decompression
- The nvidia pip packages (nvidia-cublas-cu12, etc.) are still installed but may conflict with CUDA Toolkit 13.1 — prefer Toolkit DLLs
- collect_all('cupy') gives 0 binaries on Windows — DLLs must be manually specified
- gpu_backend.py already handles CuPy absence gracefully (falls back to NumPy)
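For reference, the graceful fallback amounts to the standard CuPy-or-NumPy pattern. A minimal sketch — the real `get_array_module()` in gpu_backend.py may differ:

```python
def get_array_module():
    """Return cupy if it imports and can touch the GPU, else numpy.

    Sketch of the fallback pattern only; gpu_backend.py's actual
    implementation may add device detection and diagnostics.
    """
    try:
        import cupy
        cupy.zeros(1)  # probe: raises if CUDA DLLs or a device are missing
        return cupy
    except Exception:
        import numpy
        return numpy
```

Callers stay backend-agnostic: `xp = get_array_module(); xp.array([1, 2, 3]).sum()` runs on either backend.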