# RFCP 3.6.0 — Production GPU Build (Claude Code Task)
## Goal
Build `rfcp-server.exe` (PyInstaller) with CuPy GPU support so production RFCP
detects the NVIDIA GPU without manual `pip install`.
Currently, the production exe reports "CPU (NumPy)" because CuPy is not bundled.
## Current Environment (CONFIRMED WORKING)
```
Windows 10 (10.0.26200)
Python 3.11.8 (C:\Python311)
NVIDIA GeForce RTX 4060 Laptop GPU (8 GB VRAM)
CUDA Toolkit 13.1 (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1)
CUDA_PATH = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1
Packages:
  cupy-cuda13x  13.6.0   ← NOT cuda12x!
  numpy         1.26.4
  scipy         1.17.0
  fastrlock     0.8.3
  pyinstaller   6.18.0
GPU compute verified:
  python -c "import cupy; a = cupy.array([1,2,3]); print(a.sum())" → 6 ✅
```
## What We Already Tried (And Why It Failed)
### Attempt 1: ONEFILE spec with collect_all('cupy')
- `collect_all('cupy')` returns 1882 datas and **0 binaries**: the CuPy pip wheel does not bundle CUDA DLLs on Windows
- The CUDA DLLs come from two separate sources:
  - **nvidia pip packages** (14 DLLs in `C:\Python311\Lib\site-packages\nvidia\*\bin\`)
  - **CUDA Toolkit** (13 DLLs in `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1\bin\x64\`)
- We manually collected these 27 DLLs in the spec
- The build succeeded (3 GB exe!) but crashed on launch:
```
[PYI-10456:ERROR] Failed to extract cufft64_12.dll: decompression resulted in return code -1!
```
- Root cause: `cufft64_12.dll` is 297 MB, and PyInstaller's zlib decompression fails on it when ONEFILE mode extracts the bundle at launch
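
To see which DLLs dominate a bundle before choosing a packaging mode, a small size report can help. This is only a sketch: the helper name `largest_files` and its defaults are hypothetical and not part of the RFCP codebase.

```python
from pathlib import Path


def largest_files(root, top=5, pattern="*.dll"):
    """Return the `top` largest files under `root` as (size_in_bytes, name) pairs.

    Hypothetical helper: point it at a CUDA bin directory or at a build
    output folder to spot oversized binaries such as cufft64_12.dll.
    """
    sized = [
        (p.stat().st_size, p.name)
        for p in Path(root).rglob(pattern)
        if p.is_file()
    ]
    # Largest first; ties are broken by file name.
    return sorted(sized, reverse=True)[:top]
```

Running it against the CUDA Toolkit `bin\x64` directory should list the largest DLLs first, which makes it easier to judge what ONEFILE mode would have to compress and extract.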
### Attempt 2: ONEDIR (planned, not built yet)
### Key Insight: Duplicate DLLs from two sources
- The nvidia pip packages ship CUDA 12.x DLLs (`cublas64_12.dll`, etc.).
- CUDA Toolkit 13.1 ships CUDA 13.x DLLs (`cublas64_13.dll`, etc.).
- `cupy-cuda13x` needs the 13.x versions. The 12.x DLLs from pip may conflict.
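
One way to resolve the overlap is to group DLLs by library family and let the Toolkit copy win. The sketch below assumes CUDA DLL names follow the `<family>64_<version>.dll` pattern; `library_family` and `prefer_toolkit` are hypothetical names. It groups by family rather than by version suffix because the suffix alone is not a reliable CUDA version marker (`cufft64_12.dll` appears in this CUDA 13.1 setup).

```python
from pathlib import PureWindowsPath


def library_family(dll_path):
    """Return the library family, e.g. 'cublas' for 'cublas64_13.dll'."""
    stem = PureWindowsPath(dll_path).stem.lower()
    # Keep everything before the '64_' marker; names without it stay whole.
    return stem.split("64_", 1)[0]


def prefer_toolkit(toolkit_dlls, pip_dlls):
    """Merge two DLL path lists, keeping at most one DLL per family.

    Toolkit DLLs always win. A pip DLL is kept only when the Toolkit
    provides nothing from the same family (an assumption, not a rule
    taken from CuPy or PyInstaller).
    """
    chosen = {library_family(p): p for p in toolkit_dlls}
    for p in pip_dlls:
        chosen.setdefault(library_family(p), p)
    return sorted(chosen.values())
```

Whether pip-only families should be bundled at all is a judgment call. Dropping them entirely is the stricter reading of "include only the correct CUDA DLLs".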
## What Needs To Happen
1. **Build rfcp-server as ONEDIR** (folder with exe + DLLs, not single exe)
   - This avoids the decompression crash with large CUDA DLLs
   - Output: `backend/dist/rfcp-server/rfcp-server.exe` + all DLLs alongside
2. **Include ONLY the correct CUDA DLLs**
   - Prefer CUDA Toolkit 13.1 DLLs (match cupy-cuda13x)
   - The nvidia pip packages have cuda12x DLLs, which may cause version conflicts
   - Key DLLs needed: cublas, cusparse, cusolver, curand, cufft, nvrtc, cudart
3. **Exclude bloat**: the previous build pulled in tensorflow, grpc, opentelemetry, etc., making it 3 GB. The real size should be ~600-800 MB.
4. **Test the built exe**: run it standalone and verify:
   - `curl http://localhost:8090/api/health` returns `"build": "gpu"`
   - `curl http://localhost:8090/api/gpu/status` returns `"available": true`
   - Or at minimum: the exe starts without errors and CuPy imports successfully
5. **Update Electron integration** if needed:
   - Current Electron expects a single `rfcp-server.exe` file
   - With ONEDIR, it's a folder `rfcp-server/rfcp-server.exe`
   - File: `desktop/main.js` or `desktop/src/main.ts` (look for where it spawns the backend)
   - The path needs to change from `resources/backend/rfcp-server.exe` to `resources/backend/rfcp-server/rfcp-server.exe`
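
Steps 2 and 3 could be sketched as a helper that the GPU spec imports or inlines. PyInstaller's `Analysis` accepts `binaries` as a list of `(source, dest_dir)` tuples and `excludes` as a list of module names. The names `KEY_FAMILIES`, `EXCLUDES`, and `collect_cuda_binaries` are hypothetical, and the exclude entries are assumed to be the import names of the offending packages.

```python
from pathlib import Path

# Library families listed in step 2; DLLs are matched by file name prefix.
KEY_FAMILIES = ("cublas", "cusparse", "cusolver", "curand", "cufft", "nvrtc", "cudart")

# Packages from step 3 (assumed import names).
EXCLUDES = ["tensorflow", "grpc", "opentelemetry"]


def collect_cuda_binaries(bin_dir, families=KEY_FAMILIES, dest="."):
    """Return (source, dest_dir) tuples for the wanted DLLs in `bin_dir`.

    `dest="."` places each DLL next to the exe in the ONEDIR folder.
    """
    found = []
    for dll in sorted(Path(bin_dir).glob("*.dll")):
        # Lowercase before matching so 'cublasLt64_13.dll' matches 'cublas'.
        if dll.name.lower().startswith(tuple(families)):
            found.append((str(dll), dest))
    return found


# Sketch of how a spec file might use it (Analysis is injected by PyInstaller):
#   a = Analysis([...], binaries=collect_cuda_binaries(cuda_bin), excludes=EXCLUDES, ...)
```

In a spec file, ONEDIR output comes from passing `exclude_binaries=True` to `EXE` and handing `a.binaries` and `a.datas` to a `COLLECT` step. The working CPU spec is the place to check how this project already arranges those calls.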
## File Locations
```
D:\root\rfcp\
├── backend\
│   ├── run_server.py                 ← PyInstaller entry point
│   ├── app\
│   │   ├── main.py                   ← FastAPI app
│   │   ├── services\
│   │   │   ├── gpu_backend.py        ← GPU detection (CuPy/NumPy fallback)
│   │   │   └── coverage_service.py   ← Uses get_array_module()
│   │   └── api\routes\gpu.py         ← /api/gpu/status, /api/gpu/diagnostics
│   ├── dist\                         ← PyInstaller output goes here
│   └── build\                        ← PyInstaller build cache
├── installer\
│   ├── rfcp-server-gpu.spec          ← GPU spec (needs fixing)
│   ├── rfcp-server.spec              ← CPU spec (working, don't touch)
│   ├── rfcp.ico                      ← Icon (exists)
│   └── build-gpu.bat                 ← Build script
├── desktop\
│   ├── main.js or src/main.ts        ← Electron main process
│   └── resources\backend\            ← Where production exe lives
└── frontend\                         ← React frontend (no changes needed)
```
## Existing CPU spec for reference
The working CPU-only spec is at `installer/rfcp-server.spec`. Use it as the base
and ADD CuPy + CUDA on top. Don't reinvent the wheel.
## Build Command
```powershell
cd D:\root\rfcp\backend
pyinstaller ..\installer\rfcp-server-gpu.spec --clean --noconfirm
```
## Success Criteria
- [ ] `dist/rfcp-server/rfcp-server.exe` starts without errors
- [ ] CuPy imports successfully inside the exe (no missing DLL errors)
- [ ] `/api/gpu/status` returns `"available": true` and a `"device"` value that names the RTX 4060
- [ ] Total folder size < 1 GB (ideally 600-800 MB)
- [ ] No tensorflow/grpc/opentelemetry bloat
- [ ] Electron can find and launch the backend (path updated if needed)
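
The API and size criteria above could be checked with a short script once the exe is running. This is a sketch under assumptions: it treats both endpoints as returning flat JSON objects with the `build`, `available`, and `device` keys quoted in this task, and the helper names are hypothetical.

```python
import json
from pathlib import Path
from urllib.request import urlopen

BASE_URL = "http://localhost:8090"  # port taken from the curl commands in this task


def fetch_json(path, base_url=BASE_URL, timeout=5.0):
    """GET `base_url + path` and decode the JSON body."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def gpu_checks(health, status, expected_device="RTX 4060"):
    """Map each API-related success criterion to True or False."""
    return {
        "build_is_gpu": health.get("build") == "gpu",
        "gpu_available": status.get("available") is True,
        # Substring match, since the full device name may be longer.
        "device_matches": expected_device in str(status.get("device", "")),
    }


def folder_size_bytes(root):
    """Total size of all files under `root`, for the < 1 GB criterion."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())
```

A typical call would be `gpu_checks(fetch_json("/api/health"), fetch_json("/api/gpu/status"))`, followed by comparing `folder_size_bytes("dist/rfcp-server")` against the 1 GB limit.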
## Important Notes
- Do NOT use cupy-cuda12x — we migrated to cupy-cuda13x
- Do NOT try ONEFILE mode — cufft64_12.dll (297 MB) crashes decompression
- The nvidia pip packages (nvidia-cublas-cu12, etc.) are still installed but may
conflict with CUDA Toolkit 13.1 — prefer Toolkit DLLs
- `collect_all('cupy')` gives 0 binaries on Windows — DLLs must be manually specified
- `gpu_backend.py` already handles CuPy absence gracefully (falls back to NumPy)
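
For orientation, the fallback in the last note might look roughly like the sketch below. It is not the actual contents of `gpu_backend.py`: `load_backend` and `cupy_probe` are hypothetical, and the probe reuses the smoke test from the environment section because a missing CUDA DLL may only surface on first use rather than at import time.

```python
import importlib


def load_backend(name, probe=None):
    """Import module `name` and optionally run `probe(module)`.

    Returns the module, or None if the import or the probe fails.
    """
    try:
        module = importlib.import_module(name)
        if probe is not None:
            probe(module)
        return module
    except Exception:
        # Broad on purpose: a bundled CuPy can fail with errors other
        # than ImportError when a CUDA DLL is missing.
        return None


def cupy_probe(cp):
    """Force a real GPU computation, as in the verified one-liner."""
    if int(cp.array([1, 2, 3]).sum()) != 6:
        raise RuntimeError("unexpected GPU result")


def get_array_module():
    """Return CuPy when it works, otherwise NumPy (None if neither loads)."""
    return load_backend("cupy", cupy_probe) or load_backend("numpy")
```

With a fallback like this, a broken GPU bundle degrades to "CPU (NumPy)" without an error, so the success criteria should be checked through `/api/gpu/status` rather than by the exe merely starting.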