# RFCP Phase 2.5.0: NumPy Vectorization

**Date:** February 1, 2025
**Type:** Performance Optimization
**Priority:** HIGH
**Goal:** 10-50x speedup for Detailed preset via NumPy vectorization

---

## 🎯 Problem

Detailed preset: **346ms/point** — far too slow; it hits the 5-minute timeout.

Root cause: Python loops in `dominant_path_service.py`:

```python
for building in buildings:        # 50 iterations
    for reflector in reflectors:  # 30 iterations
        # Math operations...
```

**1500 Python loop iterations**, each with function-call overhead = slow.

---

## 🚀 Solution: NumPy Vectorization

Replace Python loops with NumPy batch operations:

- **Before:** 1500 function calls, 1500 loop iterations
- **After:** ~10 function calls, 0 Python loops
- **Expected speedup:** 10-50x

---

## 📁 Files to Create/Modify

### NEW: `backend/app/services/geometry_vectorized.py`

Core vectorized geometry functions:

```python
"""
Vectorized geometry operations using NumPy.

All functions operate on arrays, not single values.
"""
import numpy as np
from typing import Tuple, Optional

# Earth radius in meters
EARTH_RADIUS = 6371000


def haversine_batch(
    lat1: float, lon1: float,
    lats2: np.ndarray, lons2: np.ndarray
) -> np.ndarray:
    """
    Calculate distances from one point to many points.

    Args:
        lat1, lon1: Single origin point (degrees)
        lats2, lons2: Arrays of destination points (degrees), shape (N,)

    Returns:
        distances: Array of distances in meters, shape (N,)
    """
    lat1_rad = np.radians(lat1)
    lon1_rad = np.radians(lon1)
    lats2_rad = np.radians(lats2)
    lons2_rad = np.radians(lons2)

    dlat = lats2_rad - lat1_rad
    dlon = lons2_rad - lon1_rad

    a = np.sin(dlat / 2) ** 2 + np.cos(lat1_rad) * np.cos(lats2_rad) * np.sin(dlon / 2) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    return EARTH_RADIUS * c


def haversine_matrix(
    lats1: np.ndarray, lons1: np.ndarray,
    lats2: np.ndarray, lons2: np.ndarray
) -> np.ndarray:
    """
    Calculate distances between all pairs of points (M×N matrix).
    Args:
        lats1, lons1: First set of points, shape (M,)
        lats2, lons2: Second set of points, shape (N,)

    Returns:
        distances: Matrix of distances, shape (M, N)
    """
    # Reshape for broadcasting: (M, 1) and (1, N)
    lats1_rad = np.radians(lats1[:, np.newaxis])
    lons1_rad = np.radians(lons1[:, np.newaxis])
    lats2_rad = np.radians(lats2[np.newaxis, :])
    lons2_rad = np.radians(lons2[np.newaxis, :])

    dlat = lats2_rad - lats1_rad
    dlon = lons2_rad - lons1_rad

    a = np.sin(dlat / 2) ** 2 + np.cos(lats1_rad) * np.cos(lats2_rad) * np.sin(dlon / 2) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    return EARTH_RADIUS * c


def points_to_local_coords(
    ref_lat: float, ref_lon: float,
    lats: np.ndarray, lons: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Convert lat/lon to local X/Y coordinates (meters from reference).

    Uses simple equirectangular projection (good for small areas).

    Args:
        ref_lat, ref_lon: Reference point (degrees)
        lats, lons: Points to convert, shape (N,)

    Returns:
        x, y: Local coordinates in meters, shape (N,)
    """
    cos_lat = np.cos(np.radians(ref_lat))
    x = (lons - ref_lon) * 111320 * cos_lat
    y = (lats - ref_lat) * 110540
    return x, y


def line_segments_intersect_batch(
    p1: np.ndarray, p2: np.ndarray,
    segments_start: np.ndarray, segments_end: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Check if line segment p1→p2 intersects with multiple segments.

    Uses vectorized cross-product method.
    Args:
        p1: Start point (2,) — [x, y]
        p2: End point (2,) — [x, y]
        segments_start: Segment start points (N, 2)
        segments_end: Segment end points (N, 2)

    Returns:
        intersects: Boolean array (N,) — True if intersects
        t_values: Parameter values (N,) — where along p1→p2 intersection occurs
    """
    d = p2 - p1  # Direction of main line

    # Segment directions
    seg_d = segments_end - segments_start  # (N, 2)

    # Cross product for parallel check
    cross = d[0] * seg_d[:, 1] - d[1] * seg_d[:, 0]  # (N,)

    # Avoid division by zero
    parallel_mask = np.abs(cross) < 1e-10
    cross_safe = np.where(parallel_mask, 1.0, cross)

    # Vector from p1 to each segment start (direction matters:
    # start - p1 gives t and u the correct sign)
    dp = segments_start - p1  # (N, 2)

    # Calculate t (parameter on main line) and u (parameter on segments)
    t = (dp[:, 0] * seg_d[:, 1] - dp[:, 1] * seg_d[:, 0]) / cross_safe
    u = (dp[:, 0] * d[1] - dp[:, 1] * d[0]) / cross_safe

    # Intersection if 0 <= t <= 1 and 0 <= u <= 1
    intersects = ~parallel_mask & (t >= 0) & (t <= 1) & (u >= 0) & (u <= 1)

    return intersects, t


def line_intersects_polygons_batch(
    p1: np.ndarray, p2: np.ndarray,
    polygons_x: np.ndarray, polygons_y: np.ndarray,
    polygon_lengths: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Check if line p1→p2 intersects multiple polygons.
    Args:
        p1: Start point (2,) — [x, y]
        p2: End point (2,) — [x, y]
        polygons_x: Flattened polygon X coords (total_vertices,)
        polygons_y: Flattened polygon Y coords (total_vertices,)
        polygon_lengths: Number of vertices per polygon (num_polygons,)

    Returns:
        intersects: Boolean array (num_polygons,)
        min_distances: Distance to first intersection (num_polygons,)
    """
    num_polygons = len(polygon_lengths)
    intersects = np.zeros(num_polygons, dtype=bool)
    min_t = np.ones(num_polygons) * np.inf

    # Process each polygon
    idx = 0
    for i, length in enumerate(polygon_lengths):
        if length < 3:
            idx += length
            continue

        # Get polygon vertices
        px = polygons_x[idx:idx + length]
        py = polygons_y[idx:idx + length]

        # Create edge segments (including closing edge)
        starts = np.stack([px, py], axis=1)  # (length, 2)
        ends = np.stack([np.roll(px, -1), np.roll(py, -1)], axis=1)  # (length, 2)

        # Check intersections with all edges
        edge_intersects, t_vals = line_segments_intersect_batch(p1, p2, starts, ends)

        if np.any(edge_intersects):
            intersects[i] = True
            min_t[i] = np.min(t_vals[edge_intersects])

        idx += length

    # Convert t to distance
    line_length = np.linalg.norm(p2 - p1)
    min_distances = min_t * line_length

    return intersects, min_distances


def calculate_reflection_points_batch(
    tx: np.ndarray, rx: np.ndarray,
    wall_starts: np.ndarray, wall_ends: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Calculate reflection points on multiple walls.

    Uses mirror image method.
    Args:
        tx: Transmitter position (2,) — [x, y]
        rx: Receiver position (2,) — [x, y]
        wall_starts: Wall start points (N, 2)
        wall_ends: Wall end points (N, 2)

    Returns:
        reflection_points: Reflection point on each wall (N, 2)
        valid: Boolean mask for valid reflections (N,)
    """
    # Wall vectors and normals
    wall_vec = wall_ends - wall_starts  # (N, 2)
    wall_length = np.linalg.norm(wall_vec, axis=1, keepdims=True)
    wall_unit = wall_vec / np.maximum(wall_length, 1e-10)  # (N, 2)

    # Normal vectors (perpendicular to wall)
    normals = np.stack([-wall_unit[:, 1], wall_unit[:, 0]], axis=1)  # (N, 2)

    # Mirror TX across each wall
    tx_to_wall = tx - wall_starts  # (N, 2)
    tx_dist_to_wall = np.sum(tx_to_wall * normals, axis=1, keepdims=True)  # (N, 1)
    tx_mirror = tx - 2 * tx_dist_to_wall * normals  # (N, 2)

    # Find intersection of rx→tx_mirror with wall
    # Parametric: wall_start + t * wall_vec = rx + s * (tx_mirror - rx)
    rx_to_mirror = tx_mirror - rx  # (N, 2)

    # Solve for t (position along wall)
    cross_denom = (rx_to_mirror[:, 0] * wall_vec[:, 1] -
                   rx_to_mirror[:, 1] * wall_vec[:, 0])

    # Avoid division by zero
    valid_denom = np.abs(cross_denom) > 1e-10
    cross_denom_safe = np.where(valid_denom, cross_denom, 1.0)

    rx_to_start = wall_starts - rx  # (N, 2)
    t = (rx_to_start[:, 0] * rx_to_mirror[:, 1] -
         rx_to_start[:, 1] * rx_to_mirror[:, 0]) / cross_denom_safe

    # Reflection point
    reflection_points = wall_starts + t[:, np.newaxis] * wall_vec

    # Valid if t in [0, 1] and TX is on correct side of wall
    valid = valid_denom & (t >= 0) & (t <= 1) & (tx_dist_to_wall[:, 0] > 0)

    return reflection_points, valid


def find_best_reflection_path_vectorized(
    tx: np.ndarray, rx: np.ndarray,
    building_walls_start: np.ndarray,
    building_walls_end: np.ndarray,
    wall_to_building: np.ndarray,
    obstacle_polygons_x: np.ndarray,
    obstacle_polygons_y: np.ndarray,
    obstacle_lengths: np.ndarray,
    max_candidates: int = 50
) -> Tuple[Optional[np.ndarray], float, float]:
    """
    Find best single-reflection path using vectorized operations.
    Args:
        tx: Transmitter [x, y]
        rx: Receiver [x, y]
        building_walls_start: All wall start points (W, 2)
        building_walls_end: All wall end points (W, 2)
        wall_to_building: Mapping wall index → building index (W,)
        obstacle_*: Obstacle polygons for LOS checks
        max_candidates: Max reflection candidates to evaluate

    Returns:
        best_reflection_point: [x, y] or None
        best_path_length: Total path length
        best_reflection_loss: dB loss from reflection
    """
    num_walls = len(building_walls_start)
    if num_walls == 0:
        return None, np.inf, 0.0

    # Step 1: Calculate all reflection points at once
    refl_points, valid = calculate_reflection_points_batch(
        tx, rx, building_walls_start, building_walls_end
    )

    if not np.any(valid):
        return None, np.inf, 0.0

    # Step 2: Calculate path lengths for valid reflections
    valid_indices = np.where(valid)[0]
    valid_refl = refl_points[valid]  # (V, 2)

    # TX → reflection distances
    tx_to_refl = np.linalg.norm(valid_refl - tx, axis=1)  # (V,)

    # Reflection → RX distances
    refl_to_rx = np.linalg.norm(rx - valid_refl, axis=1)  # (V,)

    # Total path lengths
    path_lengths = tx_to_refl + refl_to_rx  # (V,)

    # Step 3: Sort by path length, take top candidates
    if len(valid_indices) > max_candidates:
        top_indices = np.argpartition(path_lengths, max_candidates)[:max_candidates]
        valid_indices = valid_indices[top_indices]
        valid_refl = valid_refl[top_indices]
        path_lengths = path_lengths[top_indices]
        tx_to_refl = tx_to_refl[top_indices]
        refl_to_rx = refl_to_rx[top_indices]

    # Step 4: Check LOS for each candidate (this is still a loop, but limited)
    best_idx = -1
    best_length = np.inf

    for i, (refl_pt, length) in enumerate(zip(valid_refl, path_lengths)):
        # Skip if already longer than best
        if length >= best_length:
            continue

        # Check TX → reflection LOS
        intersects1, _ = line_intersects_polygons_batch(
            tx, refl_pt,
            obstacle_polygons_x, obstacle_polygons_y, obstacle_lengths
        )
        if np.any(intersects1):
            continue

        # Check reflection → RX LOS
        intersects2, _ = line_intersects_polygons_batch(
            refl_pt, rx,
            obstacle_polygons_x, obstacle_polygons_y, obstacle_lengths
        )
        if np.any(intersects2):
            continue

        # Valid path found
        best_idx = i
        best_length = length

    if best_idx < 0:
        return None, np.inf, 0.0

    best_point = valid_refl[best_idx]

    # Reflection loss (simplified: 3-10 dB depending on angle)
    # More grazing angle = more loss
    direct_dist = np.linalg.norm(rx - tx)
    path_ratio = best_length / max(direct_dist, 1.0)
    reflection_loss = 3.0 + 7.0 * min(1.0, (path_ratio - 1.0) * 2)

    return best_point, best_length, reflection_loss
```

---

### MODIFY: `backend/app/services/dominant_path_service.py`

Replace loop-based calculations with vectorized versions:

```python
# Add imports at top:
import numpy as np

from .geometry_vectorized import (
    haversine_batch,
    points_to_local_coords,
    line_intersects_polygons_batch,
    find_best_reflection_path_vectorized
)


# Add helper to convert buildings to numpy arrays:
def _buildings_to_arrays(buildings: list, ref_lat: float, ref_lon: float):
    """Convert building list to numpy arrays for vectorized ops."""
    if not buildings:
        # Return six empty arrays so callers can always unpack the same shape
        return (np.zeros((0, 2)), np.zeros((0, 2)), np.zeros(0, dtype=int),
                np.zeros(0), np.zeros(0), np.zeros(0, dtype=int))

    # Extract centroids
    lats = np.array([b.get('centroid_lat', b.get('lat', 0)) for b in buildings])
    lons = np.array([b.get('centroid_lon', b.get('lon', 0)) for b in buildings])

    # Convert to local coords
    x, y = points_to_local_coords(ref_lat, ref_lon, lats, lons)

    # Extract all walls (polygon edges)
    all_walls_start = []
    all_walls_end = []
    wall_to_building = []

    # Flatten polygons for intersection tests
    all_poly_x = []
    all_poly_y = []
    poly_lengths = []

    for i, b in enumerate(buildings):
        coords = b.get('geometry', {}).get('coordinates', [[]])[0]
        if len(coords) < 3:
            poly_lengths.append(0)
            continue

        # Convert polygon to local coords
        poly_lats = np.array([c[1] for c in coords])
        poly_lons = np.array([c[0] for c in coords])
        px, py = points_to_local_coords(ref_lat, ref_lon, poly_lats, poly_lons)

        all_poly_x.extend(px)
        all_poly_y.extend(py)
        poly_lengths.append(len(coords))

        # Extract walls
        for j in range(len(coords) - 1):
            all_walls_start.append([px[j], py[j]])
            all_walls_end.append([px[j+1], py[j+1]])
            wall_to_building.append(i)

    return (
        np.array(all_walls_start) if all_walls_start else np.zeros((0, 2)),
        np.array(all_walls_end) if all_walls_end else np.zeros((0, 2)),
        np.array(wall_to_building) if wall_to_building else np.zeros(0, dtype=int),
        np.array(all_poly_x),
        np.array(all_poly_y),
        np.array(poly_lengths)
    )


# Update main function to use vectorized operations:
def find_dominant_path_vectorized(
    tx_lat: float, tx_lon: float,
    rx_lat: float, rx_lon: float,
    buildings: list,
    frequency_mhz: float = 1800
) -> dict:
    """
    Find dominant propagation path using vectorized NumPy operations.

    Returns dict with:
    - has_los: bool
    - path_type: 'direct' | 'reflection' | 'diffraction'
    - total_loss: float (dB)
    - reflection_point: [lat, lon] or None
    """
    # Reference point for local coords (midpoint)
    ref_lat = (tx_lat + rx_lat) / 2
    ref_lon = (tx_lon + rx_lon) / 2

    # Convert TX/RX to local coords
    tx_x, tx_y = points_to_local_coords(ref_lat, ref_lon,
                                        np.array([tx_lat]), np.array([tx_lon]))
    rx_x, rx_y = points_to_local_coords(ref_lat, ref_lon,
                                        np.array([rx_lat]), np.array([rx_lon]))
    tx = np.array([tx_x[0], tx_y[0]])
    rx = np.array([rx_x[0], rx_y[0]])

    # Convert buildings to arrays
    (walls_start, walls_end, wall_to_bldg,
     poly_x, poly_y, poly_lengths) = _buildings_to_arrays(buildings, ref_lat, ref_lon)

    direct_dist = np.linalg.norm(rx - tx)

    # Step 1: Check direct LOS
    if len(poly_lengths) == 0:
        # No buildings, direct LOS
        return {
            'has_los': True,
            'path_type': 'direct',
            'total_loss': 0.0,
            'reflection_point': None,
            'path_length': direct_dist
        }

    intersects, _ = line_intersects_polygons_batch(tx, rx, poly_x, poly_y, poly_lengths)

    if not np.any(intersects):
        # Direct LOS exists
        return {
            'has_los': True,
            'path_type': 'direct',
            'total_loss': 0.0,
            'reflection_point': None,
            'path_length': direct_dist
        }

    # Step 2: Find best reflection path
    refl_point, refl_length, refl_loss = find_best_reflection_path_vectorized(
        tx, rx,
        walls_start, walls_end, wall_to_bldg,
        poly_x, poly_y, poly_lengths,
        max_candidates=50
    )

    if refl_point is not None:
        # Convert reflection point back to lat/lon
        refl_lat = ref_lat + refl_point[1] / 110540
        refl_lon = ref_lon + refl_point[0] / (111320 * np.cos(np.radians(ref_lat)))

        return {
            'has_los': False,
            'path_type': 'reflection',
            'total_loss': refl_loss,
            'reflection_point': [refl_lat, refl_lon],
            'path_length': refl_length
        }

    # Step 3: Fallback to diffraction (simplified)
    # Count blocking buildings for diffraction loss estimate
    num_blocking = np.sum(intersects)
    diffraction_loss = 10 + 5 * min(num_blocking, 5)  # 10-35 dB

    return {
        'has_los': False,
        'path_type': 'diffraction',
        'total_loss': diffraction_loss,
        'reflection_point': None,
        'path_length': direct_dist  # Approximate
    }
```

---

### MODIFY: `backend/app/services/coverage_service.py`

Update `_calculate_point_sync()` to use the vectorized dominant path:

```python
# In _calculate_point_sync(), replace dominant_path call:
if use_dominant_path and buildings:
    from .dominant_path_service import find_dominant_path_vectorized

    dominant = find_dominant_path_vectorized(
        tx_lat=site['lat'], tx_lon=site['lon'],
        rx_lat=point_lat, rx_lon=point_lon,
        buildings=buildings,
        frequency_mhz=site.get('frequency', 1800)
    )

    if dominant['path_type'] == 'reflection':
        reflection_gain = max(0, 10 - dominant['total_loss'])  # Convert loss to gain
        building_loss = 0  # Reflection path avoids buildings
    elif dominant['path_type'] == 'diffraction':
        building_loss = dominant['total_loss']
        reflection_gain = 0
    else:  # Direct LOS
        building_loss = 0
        reflection_gain = 0
```

---

## 🧪 Testing

```bash
# Run test script
.\test-coverage.bat

# Expected results:
# Fast:     ~0.03s (unchanged)
# Standard: ~35-40s (unchanged)
# Detailed: ~30-60s (was 300s timeout!)
#           ← 5-10x faster
```

---

## ✅ Success Criteria

| Metric | Before | After |
|--------|--------|-------|
| Detailed time | 300s (timeout) | <90s |
| Detailed ms/point | 346ms | <50ms |
| Memory peak | ~7GB | ~3-4GB |
| Accuracy | Baseline | Similar ±2dB |

---

## 📝 Notes

1. **LOS checks still have a small loop** — but limited to top 50 candidates sorted by path length
2. **Building→array conversion** happens once per calculation, not per point
3. **Local coordinate system** avoids expensive lat/lon math in inner loops
4. **Reflection model simplified** — uses path ratio for loss estimate

---

## 🔜 Future Optimizations

If still slow after vectorization:

- Cache building arrays (don't reconvert every point)
- Use scipy.spatial.cKDTree for spatial queries
- GPU acceleration with CuPy (drop-in NumPy replacement)
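
---

## 🔬 Sanity Check

Before trusting the speedup numbers, it is worth confirming that the vectorized math matches the loop it replaces. Below is a minimal standalone sketch (not part of the service code) that mirrors `haversine_batch` and checks it against a pure-Python per-point loop; the test coordinates and point count are arbitrary illustration values:

```python
import math
import numpy as np

EARTH_RADIUS = 6371000  # meters, same constant as geometry_vectorized.py


def haversine_scalar(lat1, lon1, lat2, lon2):
    """Reference single-pair haversine in pure Python (the 'old' loop body)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return EARTH_RADIUS * 2 * math.asin(math.sqrt(a))


def haversine_batch(lat1, lon1, lats2, lons2):
    """Vectorized version: one origin to N destinations in a single call."""
    lat1_rad, lon1_rad = np.radians(lat1), np.radians(lon1)
    lats2_rad, lons2_rad = np.radians(lats2), np.radians(lons2)
    a = (np.sin((lats2_rad - lat1_rad) / 2) ** 2
         + np.cos(lat1_rad) * np.cos(lats2_rad) * np.sin((lons2_rad - lon1_rad) / 2) ** 2)
    return EARTH_RADIUS * 2 * np.arcsin(np.sqrt(a))


# 1000 random points within ~0.05 degrees of an arbitrary origin
rng = np.random.default_rng(42)
lats = 52.0 + rng.uniform(-0.05, 0.05, 1000)
lons = 13.0 + rng.uniform(-0.05, 0.05, 1000)

batch = haversine_batch(52.0, 13.0, lats, lons)
loop = np.array([haversine_scalar(52.0, 13.0, la, lo)
                 for la, lo in zip(lats, lons)])

# One NumPy call replaces 1000 Python-level calls, with identical results
assert np.allclose(batch, loop)
```

The same pattern (loop reference vs. batch result, `np.allclose`) extends naturally to the intersection and reflection functions, and would back the "Accuracy: Similar ±2dB" success criterion with an automated check.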