# RFCP 3.6.0: Production GPU Build (Claude Code Task)

## Goal

Build `rfcp-server.exe` (PyInstaller) with CuPy GPU support so production RFCP detects the NVIDIA GPU without manual `pip install`.

Currently production exe shows "CPU (NumPy)" because CuPy is not bundled.

## Current Environment (CONFIRMED WORKING)

```
Windows 10 (10.0.26200)
Python 3.11.8 (C:\Python311)
NVIDIA GeForce RTX 4060 Laptop GPU (8 GB VRAM)
CUDA Toolkit 13.1 (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1)
CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1

Packages:
  cupy-cuda13x 13.6.0   ← NOT cuda12x!
  numpy 1.26.4
  scipy 1.17.0
  fastrlock 0.8.3
  pyinstaller 6.18.0

GPU compute verified:
  python -c "import cupy; a = cupy.array([1,2,3]); print(a.sum())" → 6 ✅
```

## What We Already Tried (And Why It Failed)

### Attempt 1: ONEFILE spec with collect_all('cupy')

- `collect_all('cupy')` returns 1882 datas, **0 binaries**: CuPy pip doesn't bundle DLLs on Windows
- CUDA DLLs come from two separate sources:
  - **nvidia pip packages** (14 DLLs in `C:\Python311\Lib\site-packages\nvidia\*\bin\`)
  - **CUDA Toolkit** (13 DLLs in `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64\`)
- We manually collected these 27 DLLs in the spec
- Build succeeded (3 GB exe!) but crashed on launch:
  ```
  [PYI-10456:ERROR] Failed to extract cufft64_12.dll: decompression resulted in return code -1!
  ```
- Root cause: `cufft64_12.dll` is 297 MB, and PyInstaller's zlib compression fails on it in ONEFILE mode

### Attempt 2: ONEDIR

We were about to try ONEDIR but haven't built it yet.

### Key Insight: Duplicate DLLs from two sources

- nvidia pip packages have CUDA 12.x DLLs (`cublas64_12.dll` etc.)
- CUDA Toolkit 13.1 has CUDA 13.x DLLs (`cublas64_13.dll` etc.)
- CuPy-cuda13x needs the 13.x versions. The 12.x from pip may conflict.

## What Needs To Happen

1. **Build rfcp-server as ONEDIR** (folder with exe + DLLs, not single exe)
   - This avoids the decompression crash with large CUDA DLLs
   - Output: `backend/dist/rfcp-server/rfcp-server.exe` + all DLLs alongside

2. **Include ONLY the correct CUDA DLLs**
   - Prefer CUDA Toolkit 13.1 DLLs (match cupy-cuda13x)
   - The nvidia pip packages have cuda12x DLLs, which may cause version conflicts
   - Key DLLs needed: cublas, cusparse, cusolver, curand, cufft, nvrtc, cudart

3. **Exclude bloat**: the previous build pulled in tensorflow, grpc, opentelemetry etc., making it 3 GB. Real size should be ~600-800 MB.

4. **Test the built exe**: run it standalone and verify:
   - `curl http://localhost:8090/api/health` returns `"build": "gpu"`
   - `curl http://localhost:8090/api/gpu/status` returns `"available": true`
   - Or at minimum: the exe starts without errors and CuPy imports successfully

5. **Update Electron integration** if needed:
   - Current Electron expects a single `rfcp-server.exe` file
   - With ONEDIR, it's a folder `rfcp-server/rfcp-server.exe`
   - File: `desktop/main.js` or `desktop/src/main.ts` (look for where it spawns backend)
   - The path needs to change from `resources/backend/rfcp-server.exe` to `resources/backend/rfcp-server/rfcp-server.exe`

## File Locations

```
D:\root\rfcp\
├── backend\
│   ├── run_server.py               ← PyInstaller entry point
│   ├── app\
│   │   ├── main.py                 ← FastAPI app
│   │   ├── services\
│   │   │   ├── gpu_backend.py      ← GPU detection (CuPy/NumPy fallback)
│   │   │   └── coverage_service.py ← Uses get_array_module()
│   │   └── api\routes\gpu.py       ← /api/gpu/status, /api/gpu/diagnostics
│   ├── dist\                       ← PyInstaller output goes here
│   └── build\                      ← PyInstaller build cache
├── installer\
│   ├── rfcp-server-gpu.spec        ← GPU spec (needs fixing)
│   ├── rfcp-server.spec            ← CPU spec (working, don't touch)
│   ├── rfcp.ico                    ← Icon (exists)
│   └── build-gpu.bat               ← Build script
├── desktop\
│   ├── main.js or src/main.ts      ← Electron main process
│   └── resources\backend\          ← Where production exe lives
└── frontend\                       ← React frontend (no changes needed)
```

## Existing CPU spec for reference

The working CPU-only spec is at `installer/rfcp-server.spec`. Use it as the base and ADD CuPy + CUDA on top. Don't reinvent the wheel.

## Build Command

```powershell
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm
```

## Success Criteria

- [ ] `dist/rfcp-server/rfcp-server.exe` starts without errors
- [ ] CuPy imports successfully inside the exe (no missing DLL errors)
- [ ] `/api/gpu/status` returns `"available": true, "device": "RTX 4060"`
- [ ] Total folder size < 1 GB (ideally 600-800 MB)
- [ ] No tensorflow/grpc/opentelemetry bloat
- [ ] Electron can find and launch the backend (path updated if needed)

## Important Notes

- Do NOT use cupy-cuda12x: we migrated to cupy-cuda13x
- Do NOT try ONEFILE mode: `cufft64_12.dll` (297 MB) crashes decompression
- The nvidia pip packages (nvidia-cublas-cu12, etc.) are still installed but may conflict with CUDA Toolkit 13.1, so prefer Toolkit DLLs
- `collect_all('cupy')` gives 0 binaries on Windows, so DLLs must be manually specified
- `gpu_backend.py` already handles CuPy absence gracefully (falls back to NumPy)
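
Because `collect_all('cupy')` yields no binaries, the GPU spec has to list the CUDA DLLs itself. The following is a minimal sketch of how that selection could look, assuming the DLL file names begin with the library names given under "Key DLLs needed". The helper `collect_cuda_binaries` and the constant `KEY_CUDA_LIBS` are hypothetical names, not part of the existing spec, and the prefix list has not been checked against what CuPy actually loads at runtime.

```python
from pathlib import Path

# Library name prefixes from the "Key DLLs needed" item. Assumption: each
# required DLL's file name starts with one of these (case-insensitive).
KEY_CUDA_LIBS = ("cublas", "cusparse", "cusolver", "curand", "cufft", "nvrtc", "cudart")


def collect_cuda_binaries(bin_dir, prefixes=KEY_CUDA_LIBS, dest="."):
    """Return PyInstaller-style (source_path, dest_dir) tuples for matching DLLs.

    Hypothetical helper: scans a single directory only, so passing the CUDA
    Toolkit bin folder (and not the nvidia pip package folders) keeps the
    12.x DLLs from pip out of the bundle.
    """
    bin_dir = Path(bin_dir)
    if not bin_dir.is_dir():
        # A missing Toolkit folder yields an empty list rather than an error.
        return []
    wanted = tuple(p.lower() for p in prefixes)
    return [
        (str(dll), dest)
        for dll in sorted(bin_dir.glob("*.dll"))
        if dll.name.lower().startswith(wanted)
    ]
```

In the spec this might be wired in as `binaries=collect_cuda_binaries(TOOLKIT_BIN)` inside the `Analysis(...)` call, where `TOOLKIT_BIN` is a variable you define to point at the Toolkit's `bin\x64` folder. The `"."` destination places the DLLs next to the other bundled files; a different destination may be needed depending on how CuPy resolves them.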
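
Two of the Success Criteria (folder size under 1 GB, no tensorflow/grpc/opentelemetry) can be checked mechanically after the build. This is a rough sketch, assuming that a bundled package shows up as a directory with the package's name somewhere under the ONEDIR output folder. The function names and the report format are illustrative only.

```python
from pathlib import Path

SIZE_LIMIT_BYTES = 1024 ** 3  # "< 1 GB" from Success Criteria
BLOAT_PACKAGES = ("tensorflow", "grpc", "opentelemetry")


def folder_size_bytes(root):
    """Sum the sizes of all regular files under root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def find_bloat(root, names=BLOAT_PACKAGES):
    """Return relative paths of directories named after an unwanted package.

    Assumption: an unwanted package appears as a directory with exactly that
    name at some depth of the bundle.
    """
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_dir() and p.name.lower() in names
    )


def check_dist(root, limit=SIZE_LIMIT_BYTES):
    """Combine both checks into one small report dictionary."""
    size = folder_size_bytes(root)
    bloat = find_bloat(root)
    return {
        "size_bytes": size,
        "size_ok": size < limit,
        "bloat": bloat,
        "bloat_free": not bloat,
    }
```

Running `check_dist(r"D:\root\rfcp\backend\dist\rfcp-server")` after the build would then show both results at once. If bloat is found, the usual remedy is to add those package names to the `excludes` list of the spec's `Analysis(...)` call.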
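
The two `curl` checks in step 4 can also be scripted. The sketch below assumes that both endpoints return a flat JSON object with `build` and `available` as top-level keys; the brief only shows those fragments, so the exact response shape is a guess. `fetch_json` and `gpu_build_ok` are hypothetical names.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8090"  # port taken from the curl commands in step 4


def fetch_json(path, base_url=BASE_URL, timeout=5.0):
    """GET base_url + path and decode the body as JSON."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def gpu_build_ok(health, gpu_status):
    """True when the health payload reports a GPU build and the GPU is available.

    Assumption: "build" and "available" are top-level keys of the responses.
    """
    return health.get("build") == "gpu" and gpu_status.get("available") is True
```

With the built exe running, `gpu_build_ok(fetch_json("/api/health"), fetch_json("/api/gpu/status"))` should evaluate to `True`. A connection error means the server is not up, which is a different failure from a CPU fallback.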