Files
rfcp/RFCP-Phase-2.4.2-Final-Fixes.md
2026-02-01 12:02:52 +02:00

496 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# RFCP Phase 2.4.2: Final Critical Fixes
**Date:** February 1, 2025
**Type:** Bug Fixes
**Priority:** CRITICAL
**Depends on:** Phase 2.4.1
---
## 🎯 Summary
Phase 2.4.1 частково спрацював, але залишились проблеми:
- Memory leak — workers не вбиваються (psutil не працює для grandchildren)
- Dominant path — все ще timeout (фільтр можливо не застосовується)
- Elevation — все зелене (немає локального контрасту)
---
## 🔴 Bug 2.4.2a: Memory Leak — Nuclear Kill
**Problem:**
psutil.Process.children() не бачить grandchildren (worker subprocess → python subprocess).
Після cleanup все ще 8× rfcp-server.exe, 8GB RAM.
**File:** `backend/app/services/parallel_coverage_service.py`
**Current code doesn't work:**
```python
current = psutil.Process(os.getpid())
children = current.children(recursive=True)
for child in children:
child.terminate()
```
**Fix — kill by process NAME, not by PID tree:**
```python
import subprocess
import sys
def _kill_worker_processes():
"""
Nuclear option: kill ALL rfcp-server processes except main.
This handles grandchildren that psutil can't see.
"""
my_pid = os.getpid()
killed_count = 0
if sys.platform == 'win32':
# Windows: use tasklist to find all rfcp-server.exe, kill all except self
try:
# Get list of all rfcp-server PIDs
result = subprocess.run(
['tasklist', '/FI', 'IMAGENAME eq rfcp-server.exe', '/FO', 'CSV', '/NH'],
capture_output=True, text=True, timeout=5
)
for line in result.stdout.strip().split('\n'):
if 'rfcp-server.exe' in line:
# Parse PID from CSV: "rfcp-server.exe","1234",...
parts = line.split(',')
if len(parts) >= 2:
pid_str = parts[1].strip().strip('"')
try:
pid = int(pid_str)
if pid != my_pid:
subprocess.run(
['taskkill', '/F', '/PID', str(pid)],
capture_output=True, timeout=5
)
killed_count += 1
_clog(f"Killed worker PID {pid}")
except (ValueError, subprocess.TimeoutExpired):
pass
except Exception as e:
_clog(f"Kill workers error: {e}")
# Fallback: kill ALL rfcp-server except hope main survives
try:
subprocess.run(
['taskkill', '/F', '/IM', 'rfcp-server.exe', '/T'],
capture_output=True, timeout=5
)
except:
pass
else:
# Unix: use pgrep/pkill
try:
result = subprocess.run(
['pgrep', '-f', 'rfcp-server'],
capture_output=True, text=True, timeout=5
)
for pid_str in result.stdout.strip().split('\n'):
if pid_str:
try:
pid = int(pid_str)
if pid != my_pid:
os.kill(pid, 9) # SIGKILL
killed_count += 1
_clog(f"Killed worker PID {pid}")
except (ValueError, ProcessLookupError, PermissionError):
pass
except Exception as e:
_clog(f"Kill workers error: {e}")
return killed_count
```
**Also update ProcessPoolExecutor to use spawn context explicitly:**
```python
import multiprocessing as mp
def _calculate_with_process_pool(...):
# Use spawn to ensure clean worker processes
ctx = mp.get_context('spawn')
pool = None
try:
pool = ProcessPoolExecutor(
max_workers=num_workers,
mp_context=ctx
)
# ... rest of code ...
finally:
if pool:
pool.shutdown(wait=False, cancel_futures=True)
# Give pool time to cleanup
import time
time.sleep(0.5)
# Then force kill any survivors
killed = _kill_worker_processes()
if killed > 0:
_clog(f"Force killed {killed} orphaned workers")
```
---
## 🔴 Bug 2.4.2b: Dominant Path — Add Logging + Reduce Limits
**Problem:**
Detailed preset still timeouts. Unknown if filter is being applied.
**File:** `backend/app/services/dominant_path_service.py`
**Add diagnostic logging to verify filter works:**
```python
# At the TOP of the file, add constants (if not already there):
MAX_BUILDINGS_FOR_LINE = 50 # Reduced from 100
MAX_BUILDINGS_FOR_REFLECTION = 30 # Reduced from 100
MAX_DISTANCE_FROM_PATH = 300 # Reduced from 500m
def _filter_buildings_by_distance(buildings, tx_point, rx_point, max_count, max_distance):
"""Filter buildings to nearest N within max_distance of path."""
original_count = len(buildings)
if original_count <= max_count:
_log(f"[FILTER] {original_count} buildings, no filter needed")
return buildings
# Calculate midpoint
mid_lat = (tx_point[0] + rx_point[0]) / 2
mid_lon = (tx_point[1] + rx_point[1]) / 2
# Calculate squared distance to midpoint (no sqrt for speed)
def dist_sq(b):
blat = b.get('centroid_lat') or b.get('lat', mid_lat)
blon = b.get('centroid_lon') or b.get('lon', mid_lon)
dlat = (blat - mid_lat) * 111000
dlon = (blon - mid_lon) * 111000 * 0.65 # cos(50°) ≈ 0.65
return dlat*dlat + dlon*dlon
# Sort by distance
buildings_sorted = sorted(buildings, key=dist_sq)
# Filter by max distance
max_dist_sq = max_distance * max_distance
filtered = [b for b in buildings_sorted if dist_sq(b) <= max_dist_sq]
# Take top N
result = filtered[:max_count]
_log(f"[FILTER] {original_count}{len(result)} buildings (max_count={max_count}, max_dist={max_distance}m)")
return result
```
**Verify the filter is CALLED in the main function:**
```python
def find_dominant_path_sync(tx, rx, buildings, vegetation, spatial_idx, frequency, ...):
"""Find dominant propagation path."""
# Get buildings from spatial index
line_buildings_raw = spatial_idx.query_line(tx['lat'], tx['lon'], rx['lat'], rx['lon'])
# FILTER - this MUST be called
line_buildings = _filter_buildings_by_distance(
line_buildings_raw,
(tx['lat'], tx['lon']),
(rx['lat'], rx['lon']),
max_count=MAX_BUILDINGS_FOR_LINE,
max_distance=MAX_DISTANCE_FROM_PATH
)
# Same for reflection candidates
mid_lat = (tx['lat'] + rx['lat']) / 2
mid_lon = (tx['lon'] + rx['lon']) / 2
refl_buildings_raw = spatial_idx.query_point(mid_lat, mid_lon, buffer_cells=3)
refl_buildings = _filter_buildings_by_distance(
refl_buildings_raw,
(tx['lat'], tx['lon']),
(rx['lat'], rx['lon']),
max_count=MAX_BUILDINGS_FOR_REFLECTION,
max_distance=MAX_DISTANCE_FROM_PATH
)
# Update diagnostic log to show FILTERED counts
if _point_counter[0] <= 3:
print(f"[DOMINANT_PATH] Point #{_point_counter[0]}: "
f"line_bldgs={len(line_buildings)} (was {len(line_buildings_raw)}), "
f"refl_bldgs={len(refl_buildings)} (was {len(refl_buildings_raw)})")
# ... rest of function ...
```
**If still too slow — option to DISABLE reflections:**
```python
# Quick fix: skip reflection calculation entirely
ENABLE_REFLECTIONS = False # Set to True when performance is fixed
def find_dominant_path_sync(...):
# Direct path
direct_loss = calculate_direct_path(...)
if not ENABLE_REFLECTIONS:
return direct_loss
# Reflection paths (slow)
reflection_loss = calculate_reflections(...)
return min(direct_loss, reflection_loss)
```
---
## 🟡 Bug 2.4.2c: Elevation Layer — Local Min/Max Contrast
**Problem:**
All green because using absolute thresholds (100m, 150m, 200m...) but local terrain varies only 150-200m.
**File:** `frontend/src/components/map/ElevationLayer.tsx`
**Fix — use RELATIVE coloring based on local min/max:**
```tsx
// Color palette (keep these)
const COLORS = {
DEEP_BLUE: [33, 102, 172], // Lowest
LIGHT_BLUE: [103, 169, 207],
GREEN: [145, 207, 96],
YELLOW: [254, 224, 139],
ORANGE: [252, 141, 89],
BROWN: [215, 48, 39], // Highest
};
// Interpolate between two colors
function interpolateColor(
color1: number[],
color2: number[],
t: number
): [number, number, number] {
return [
Math.round(color1[0] + (color2[0] - color1[0]) * t),
Math.round(color1[1] + (color2[1] - color1[1]) * t),
Math.round(color1[2] + (color2[2] - color1[2]) * t),
];
}
// NEW: Get color based on NORMALIZED elevation (0-1)
function getColorForNormalizedElevation(normalized: number): [number, number, number] {
// Clamp to 0-1
const n = Math.max(0, Math.min(1, normalized));
if (n < 0.2) {
// 0-20%: deep blue → light blue
return interpolateColor(COLORS.DEEP_BLUE, COLORS.LIGHT_BLUE, n / 0.2);
} else if (n < 0.4) {
// 20-40%: light blue → green
return interpolateColor(COLORS.LIGHT_BLUE, COLORS.GREEN, (n - 0.2) / 0.2);
} else if (n < 0.6) {
// 40-60%: green → yellow
return interpolateColor(COLORS.GREEN, COLORS.YELLOW, (n - 0.4) / 0.2);
} else if (n < 0.8) {
// 60-80%: yellow → orange
return interpolateColor(COLORS.YELLOW, COLORS.ORANGE, (n - 0.6) / 0.2);
} else {
// 80-100%: orange → brown
return interpolateColor(COLORS.ORANGE, COLORS.BROWN, (n - 0.8) / 0.2);
}
}
// In the main render function, USE local min/max:
useEffect(() => {
// ... fetch elevation data ...
const data = await response.json();
// Get LOCAL min/max from the actual data
const minElev = data.min_elevation; // e.g., 152
const maxElev = data.max_elevation; // e.g., 198
const elevRange = maxElev - minElev || 1; // Avoid division by zero
console.log(`[Elevation] Local range: ${minElev}m - ${maxElev}m (${elevRange}m difference)`);
// Fill pixel data with NORMALIZED colors
for (let i = 0; i < data.rows; i++) {
for (let j = 0; j < data.cols; j++) {
const elevation = data.elevations[i][j];
// Normalize to 0-1 based on LOCAL range
const normalized = (elevation - minElev) / elevRange;
const color = getColorForNormalizedElevation(normalized);
const idx = (i * data.cols + j) * 4;
imageData.data[idx] = color[0]; // R
imageData.data[idx + 1] = color[1]; // G
imageData.data[idx + 2] = color[2]; // B
imageData.data[idx + 3] = 255; // A (full opacity, layer opacity handled separately)
}
}
// ... rest of canvas/overlay code ...
}, [enabled, opacity, bbox, map]);
```
**Also add elevation legend showing LOCAL range:**
```tsx
// In the parent component (App.tsx or Map.tsx), show legend:
{showElevation && elevationRange && (
<div className="elevation-legend">
<div className="legend-title">Elevation</div>
<div className="legend-gradient"></div>
<div className="legend-labels">
<span>{elevationRange.min}m</span>
<span>{elevationRange.max}m</span>
</div>
</div>
)}
// CSS for gradient:
.legend-gradient {
height: 10px;
background: linear-gradient(to right,
#2166ac, /* deep blue - low */
#67a9cf, /* light blue */
#91cf60, /* green */
#fee08b, /* yellow */
#fc8d59, /* orange */
#d73027 /* brown - high */
);
border-radius: 2px;
}
```
---
## 🟢 Enhancement 2.4.2d: GPU Install Message
**File:** `backend/app/services/gpu_service.py`
**Add clear install instructions on startup:**
```python
# At module init:
GPU_AVAILABLE = False
GPU_INFO = None
try:
import cupy as cp
# Check CUDA
device_count = cp.cuda.runtime.getDeviceCount()
if device_count > 0:
GPU_AVAILABLE = True
props = cp.cuda.runtime.getDeviceProperties(0)
GPU_INFO = {
'name': props['name'].decode() if isinstance(props['name'], bytes) else str(props['name']),
'memory_mb': props['totalGlobalMem'] // (1024 * 1024),
'cuda_version': cp.cuda.runtime.runtimeGetVersion(),
}
print(f"[GPU] ✓ CUDA available: {GPU_INFO['name']} ({GPU_INFO['memory_mb']} MB)")
else:
print("[GPU] ✗ No CUDA devices found")
except ImportError:
print("[GPU] ✗ CuPy not installed — using CPU/NumPy")
print("[GPU] To enable GPU acceleration, install CuPy:")
print("[GPU] ")
print("[GPU] For CUDA 12.x: pip install cupy-cuda12x")
print("[GPU] For CUDA 11.x: pip install cupy-cuda11x")
print("[GPU] ")
print("[GPU] Check CUDA version: nvidia-smi")
except Exception as e:
print(f"[GPU] ✗ CuPy error: {e}")
print("[GPU] GPU acceleration disabled")
```
---
## 📁 Files to Modify
| File | Changes |
|------|---------|
| `backend/app/services/parallel_coverage_service.py` | Rewrite `_kill_worker_processes()` to kill by name; add spawn context |
| `backend/app/services/dominant_path_service.py` | Add detailed filter logging; reduce limits to 50/30; add ENABLE_REFLECTIONS flag |
| `frontend/src/components/map/ElevationLayer.tsx` | Use local min/max for color normalization |
| `backend/app/services/gpu_service.py` | Add clear install instructions |
---
## 🧪 Testing
### Test 1: Memory Cleanup
```bash
# Run Detailed preset (will timeout)
# Watch Task Manager during and after
# After timeout message:
# - Should see only 1 rfcp-server.exe
# - RAM should drop from 8GB to <500MB
```
### Test 2: Dominant Path Logging
```bash
# Run debug mode, watch console
# Should see:
# [FILTER] 646 → 50 buildings (max_count=50, max_dist=300m)
# [DOMINANT_PATH] Point #1: line_bldgs=50 (was 646), refl_bldgs=30 (was 302)
```
### Test 3: Elevation Contrast
```bash
# Open app
# Enable elevation layer
# Should see color variation:
# - Blue in valleys
# - Green/yellow on slopes
# - Brown/orange on hills
# Console should show: "[Elevation] Local range: 152m - 198m"
```
### Test 4: GPU Message
```bash
# Start server, check console
# Should see clear message about CuPy install
```
---
## ✅ Success Criteria
- [ ] After timeout: only 1 rfcp-server.exe, RAM < 500MB
- [ ] Dominant path logs show filtered counts (50 buildings, not 600)
- [ ] Detailed preset completes in <120s OR logs explain why still slow
- [ ] Elevation layer shows visible terrain contrast
- [ ] GPU install instructions visible in console
---
## 📈 Expected Results After Fixes
| Metric | Before | After |
|--------|--------|-------|
| Workers after timeout | 8 (7.8GB) | 1 (<500MB) |
| Buildings per point | 600+ | 50 |
| Detailed time | 300s timeout | ~60-90s |
| Elevation | All green | Color gradient |
---
## 🔜 Next Phase
After 2.4.2:
- Phase 2.5: Loading screen with fun facts
- Phase 2.5: Better error messages in UI
- Phase 2.6: Export coverage to GeoJSON/KML