# RFCP Phase 2.5.0: NumPy Vectorization

**Date:** February 1, 2025
**Type:** Performance Optimization
**Priority:** HIGH
**Goal:** 10-50x speedup for Detailed preset via NumPy vectorization

---

## 🎯 Problem

Detailed preset: **346ms/point** — far too slow; it hits the 5-minute timeout.

Root cause: Python loops in `dominant_path_service.py`:

```python
for building in buildings:        # 50 iterations
    for reflector in reflectors:  # 30 iterations
        # Math operations...
```

**1500 Python loop iterations**, each with function-call overhead = slow.

---

## 🚀 Solution: NumPy Vectorization

Replace Python loops with NumPy batch operations:

- **Before:** 1500 function calls, 1500 loop iterations
- **After:** ~10 function calls, 0 Python loops
- **Expected speedup:** 10-50x

---

## 📁 Files to Create/Modify

### NEW: `backend/app/services/geometry_vectorized.py`

Core vectorized geometry functions:

```python
"""
Vectorized geometry operations using NumPy.

All functions operate on arrays, not single values.
"""
import numpy as np
from typing import Tuple, Optional

# Earth radius in meters
EARTH_RADIUS = 6371000


def haversine_batch(
    lat1: float, lon1: float,
    lats2: np.ndarray, lons2: np.ndarray
) -> np.ndarray:
    """
    Calculate distances from one point to many points.

    Args:
        lat1, lon1: Single origin point (degrees)
        lats2, lons2: Arrays of destination points (degrees), shape (N,)

    Returns:
        distances: Array of distances in meters, shape (N,)
    """
    lat1_rad = np.radians(lat1)
    lon1_rad = np.radians(lon1)
    lats2_rad = np.radians(lats2)
    lons2_rad = np.radians(lons2)

    dlat = lats2_rad - lat1_rad
    dlon = lons2_rad - lon1_rad

    a = np.sin(dlat / 2) ** 2 + np.cos(lat1_rad) * np.cos(lats2_rad) * np.sin(dlon / 2) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    return EARTH_RADIUS * c


def haversine_matrix(
    lats1: np.ndarray, lons1: np.ndarray,
    lats2: np.ndarray, lons2: np.ndarray
) -> np.ndarray:
    """
    Calculate distances between all pairs of points (M×N matrix).
    Args:
        lats1, lons1: First set of points, shape (M,)
        lats2, lons2: Second set of points, shape (N,)

    Returns:
        distances: Matrix of distances, shape (M, N)
    """
    # Reshape for broadcasting: (M, 1) and (1, N)
    lats1_rad = np.radians(lats1[:, np.newaxis])
    lons1_rad = np.radians(lons1[:, np.newaxis])
    lats2_rad = np.radians(lats2[np.newaxis, :])
    lons2_rad = np.radians(lons2[np.newaxis, :])

    dlat = lats2_rad - lats1_rad
    dlon = lons2_rad - lons1_rad

    a = np.sin(dlat / 2) ** 2 + np.cos(lats1_rad) * np.cos(lats2_rad) * np.sin(dlon / 2) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    return EARTH_RADIUS * c


def points_to_local_coords(
    ref_lat: float, ref_lon: float,
    lats: np.ndarray, lons: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Convert lat/lon to local X/Y coordinates (meters from reference).

    Uses simple equirectangular projection (good for small areas).

    Args:
        ref_lat, ref_lon: Reference point (degrees)
        lats, lons: Points to convert, shape (N,)

    Returns:
        x, y: Local coordinates in meters, shape (N,)
    """
    cos_lat = np.cos(np.radians(ref_lat))
    x = (lons - ref_lon) * 111320 * cos_lat
    y = (lats - ref_lat) * 110540
    return x, y


def line_segments_intersect_batch(
    p1: np.ndarray, p2: np.ndarray,
    segments_start: np.ndarray, segments_end: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Check if line segment p1→p2 intersects with multiple segments.

    Uses vectorized cross-product method.
    Args:
        p1: Start point (2,) — [x, y]
        p2: End point (2,) — [x, y]
        segments_start: Segment start points (N, 2)
        segments_end: Segment end points (N, 2)

    Returns:
        intersects: Boolean array (N,) — True if intersects
        t_values: Parameter values (N,) — where along p1→p2 intersection occurs
    """
    d = p2 - p1  # Direction of main line

    # Segment directions
    seg_d = segments_end - segments_start  # (N, 2)

    # Cross product for parallel check
    cross = d[0] * seg_d[:, 1] - d[1] * seg_d[:, 0]  # (N,)

    # Avoid division by zero
    parallel_mask = np.abs(cross) < 1e-10
    cross_safe = np.where(parallel_mask, 1.0, cross)

    # Vector from p1 to each segment start (direction matters:
    # start - p1 gives t and u the correct sign)
    dp = segments_start - p1  # (N, 2)

    # Calculate t (parameter on main line) and u (parameter on segments)
    t = (dp[:, 0] * seg_d[:, 1] - dp[:, 1] * seg_d[:, 0]) / cross_safe
    u = (dp[:, 0] * d[1] - dp[:, 1] * d[0]) / cross_safe

    # Intersection if 0 <= t <= 1 and 0 <= u <= 1
    intersects = ~parallel_mask & (t >= 0) & (t <= 1) & (u >= 0) & (u <= 1)

    return intersects, t


def line_intersects_polygons_batch(
    p1: np.ndarray, p2: np.ndarray,
    polygons_x: np.ndarray, polygons_y: np.ndarray,
    polygon_lengths: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Check if line p1→p2 intersects multiple polygons.
    Args:
        p1: Start point (2,) — [x, y]
        p2: End point (2,) — [x, y]
        polygons_x: Flattened polygon X coords (total_vertices,)
        polygons_y: Flattened polygon Y coords (total_vertices,)
        polygon_lengths: Number of vertices per polygon (num_polygons,)

    Returns:
        intersects: Boolean array (num_polygons,)
        min_distances: Distance to first intersection (num_polygons,)
    """
    num_polygons = len(polygon_lengths)
    intersects = np.zeros(num_polygons, dtype=bool)
    min_t = np.ones(num_polygons) * np.inf

    # Process each polygon
    idx = 0
    for i, length in enumerate(polygon_lengths):
        if length < 3:
            idx += length
            continue

        # Get polygon vertices
        px = polygons_x[idx:idx + length]
        py = polygons_y[idx:idx + length]

        # Create edge segments (including closing edge)
        starts = np.stack([px, py], axis=1)  # (length, 2)
        ends = np.stack([np.roll(px, -1), np.roll(py, -1)], axis=1)  # (length, 2)

        # Check intersections with all edges
        edge_intersects, t_vals = line_segments_intersect_batch(p1, p2, starts, ends)

        if np.any(edge_intersects):
            intersects[i] = True
            min_t[i] = np.min(t_vals[edge_intersects])

        idx += length

    # Convert t to distance
    line_length = np.linalg.norm(p2 - p1)
    min_distances = min_t * line_length

    return intersects, min_distances


def calculate_reflection_points_batch(
    tx: np.ndarray, rx: np.ndarray,
    wall_starts: np.ndarray, wall_ends: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Calculate reflection points on multiple walls.

    Uses mirror image method.
    Args:
        tx: Transmitter position (2,) — [x, y]
        rx: Receiver position (2,) — [x, y]
        wall_starts: Wall start points (N, 2)
        wall_ends: Wall end points (N, 2)

    Returns:
        reflection_points: Reflection point on each wall (N, 2)
        valid: Boolean mask for valid reflections (N,)
    """
    # Wall vectors and normals
    wall_vec = wall_ends - wall_starts  # (N, 2)
    wall_length = np.linalg.norm(wall_vec, axis=1, keepdims=True)
    wall_unit = wall_vec / np.maximum(wall_length, 1e-10)  # (N, 2)

    # Normal vectors (perpendicular to wall)
    normals = np.stack([-wall_unit[:, 1], wall_unit[:, 0]], axis=1)  # (N, 2)

    # Mirror TX across each wall
    tx_to_wall = tx - wall_starts  # (N, 2)
    tx_dist_to_wall = np.sum(tx_to_wall * normals, axis=1, keepdims=True)  # (N, 1)
    tx_mirror = tx - 2 * tx_dist_to_wall * normals  # (N, 2)

    # Find intersection of rx→tx_mirror with wall
    # Parametric: wall_start + t * wall_vec = rx + s * (tx_mirror - rx)
    rx_to_mirror = tx_mirror - rx  # (N, 2)

    # Solve for t (position along wall)
    cross_denom = (rx_to_mirror[:, 0] * wall_vec[:, 1] -
                   rx_to_mirror[:, 1] * wall_vec[:, 0])

    # Avoid division by zero
    valid_denom = np.abs(cross_denom) > 1e-10
    cross_denom_safe = np.where(valid_denom, cross_denom, 1.0)

    rx_to_start = wall_starts - rx  # (N, 2)
    t = (rx_to_start[:, 0] * rx_to_mirror[:, 1] -
         rx_to_start[:, 1] * rx_to_mirror[:, 0]) / cross_denom_safe

    # Reflection point
    reflection_points = wall_starts + t[:, np.newaxis] * wall_vec

    # Valid if t in [0, 1] and TX is on correct side of wall
    valid = valid_denom & (t >= 0) & (t <= 1) & (tx_dist_to_wall[:, 0] > 0)

    return reflection_points, valid


def find_best_reflection_path_vectorized(
    tx: np.ndarray, rx: np.ndarray,
    building_walls_start: np.ndarray,
    building_walls_end: np.ndarray,
    wall_to_building: np.ndarray,
    obstacle_polygons_x: np.ndarray,
    obstacle_polygons_y: np.ndarray,
    obstacle_lengths: np.ndarray,
    max_candidates: int = 50
) -> Tuple[Optional[np.ndarray], float, float]:
    """
    Find best single-reflection path using vectorized operations.
    Args:
        tx: Transmitter [x, y]
        rx: Receiver [x, y]
        building_walls_start: All wall start points (W, 2)
        building_walls_end: All wall end points (W, 2)
        wall_to_building: Mapping wall index → building index (W,)
        obstacle_*: Obstacle polygons for LOS checks
        max_candidates: Max reflection candidates to evaluate

    Returns:
        best_reflection_point: [x, y] or None
        best_path_length: Total path length
        best_reflection_loss: dB loss from reflection
    """
    num_walls = len(building_walls_start)
    if num_walls == 0:
        return None, np.inf, 0.0

    # Step 1: Calculate all reflection points at once
    refl_points, valid = calculate_reflection_points_batch(
        tx, rx, building_walls_start, building_walls_end
    )

    if not np.any(valid):
        return None, np.inf, 0.0

    # Step 2: Calculate path lengths for valid reflections
    valid_indices = np.where(valid)[0]
    valid_refl = refl_points[valid]  # (V, 2)

    # TX → reflection distances
    tx_to_refl = np.linalg.norm(valid_refl - tx, axis=1)  # (V,)

    # Reflection → RX distances
    refl_to_rx = np.linalg.norm(rx - valid_refl, axis=1)  # (V,)

    # Total path lengths
    path_lengths = tx_to_refl + refl_to_rx  # (V,)

    # Step 3: Sort by path length, take top candidates
    if len(valid_indices) > max_candidates:
        top_indices = np.argpartition(path_lengths, max_candidates)[:max_candidates]
        valid_indices = valid_indices[top_indices]
        valid_refl = valid_refl[top_indices]
        path_lengths = path_lengths[top_indices]
        tx_to_refl = tx_to_refl[top_indices]
        refl_to_rx = refl_to_rx[top_indices]

    # Step 4: Check LOS for each candidate (this is still a loop, but limited)
    best_idx = -1
    best_length = np.inf

    for i, (refl_pt, length) in enumerate(zip(valid_refl, path_lengths)):
        # Skip if already longer than best
        if length >= best_length:
            continue

        # Check TX → reflection LOS
        intersects1, _ = line_intersects_polygons_batch(
            tx, refl_pt,
            obstacle_polygons_x, obstacle_polygons_y, obstacle_lengths
        )
        if np.any(intersects1):
            continue

        # Check reflection → RX LOS
        intersects2, _ = line_intersects_polygons_batch(
            refl_pt, rx,
            obstacle_polygons_x, obstacle_polygons_y, obstacle_lengths
        )
        if np.any(intersects2):
            continue

        # Valid path found
        best_idx = i
        best_length = length

    if best_idx < 0:
        return None, np.inf, 0.0

    best_point = valid_refl[best_idx]

    # Reflection loss (simplified: 3-10 dB depending on angle)
    # More grazing angle = more loss
    direct_dist = np.linalg.norm(rx - tx)
    path_ratio = best_length / max(direct_dist, 1.0)
    reflection_loss = 3.0 + 7.0 * min(1.0, (path_ratio - 1.0) * 2)

    return best_point, best_length, reflection_loss
```

---

### MODIFY: `backend/app/services/dominant_path_service.py`

Replace loop-based calculations with vectorized versions:

```python
# Add imports at top:
import numpy as np

from .geometry_vectorized import (
    haversine_batch,
    points_to_local_coords,
    line_intersects_polygons_batch,
    find_best_reflection_path_vectorized
)


# Add helper to convert buildings to numpy arrays:
def _buildings_to_arrays(buildings: list, ref_lat: float, ref_lon: float):
    """Convert building list to numpy arrays for vectorized ops."""
    if not buildings:
        # Return six empty arrays so callers can always unpack the same shape
        return (np.zeros((0, 2)), np.zeros((0, 2)), np.zeros(0, dtype=int),
                np.zeros(0), np.zeros(0), np.zeros(0, dtype=int))

    # Extract centroids
    lats = np.array([b.get('centroid_lat', b.get('lat', 0)) for b in buildings])
    lons = np.array([b.get('centroid_lon', b.get('lon', 0)) for b in buildings])

    # Convert to local coords
    x, y = points_to_local_coords(ref_lat, ref_lon, lats, lons)

    # Extract all walls (polygon edges)
    all_walls_start = []
    all_walls_end = []
    wall_to_building = []

    # Flatten polygons for intersection tests
    all_poly_x = []
    all_poly_y = []
    poly_lengths = []

    for i, b in enumerate(buildings):
        coords = b.get('geometry', {}).get('coordinates', [[]])[0]
        if len(coords) < 3:
            poly_lengths.append(0)
            continue

        # Convert polygon to local coords
        poly_lats = np.array([c[1] for c in coords])
        poly_lons = np.array([c[0] for c in coords])
        px, py = points_to_local_coords(ref_lat, ref_lon, poly_lats, poly_lons)

        all_poly_x.extend(px)
        all_poly_y.extend(py)
        poly_lengths.append(len(coords))

        # Extract walls
        for j in range(len(coords) - 1):
            all_walls_start.append([px[j], py[j]])
            all_walls_end.append([px[j+1], py[j+1]])
            wall_to_building.append(i)

    return (
        np.array(all_walls_start) if all_walls_start else np.zeros((0, 2)),
        np.array(all_walls_end) if all_walls_end else np.zeros((0, 2)),
        np.array(wall_to_building) if wall_to_building else np.zeros(0, dtype=int),
        np.array(all_poly_x),
        np.array(all_poly_y),
        np.array(poly_lengths)
    )


# Update main function to use vectorized operations:
def find_dominant_path_vectorized(
    tx_lat: float, tx_lon: float,
    rx_lat: float, rx_lon: float,
    buildings: list,
    frequency_mhz: float = 1800
) -> dict:
    """
    Find dominant propagation path using vectorized NumPy operations.

    Returns dict with:
    - has_los: bool
    - path_type: 'direct' | 'reflection' | 'diffraction'
    - total_loss: float (dB)
    - reflection_point: [lat, lon] or None
    """
    # Reference point for local coords (midpoint)
    ref_lat = (tx_lat + rx_lat) / 2
    ref_lon = (tx_lon + rx_lon) / 2

    # Convert TX/RX to local coords
    tx_x, tx_y = points_to_local_coords(ref_lat, ref_lon,
                                        np.array([tx_lat]), np.array([tx_lon]))
    rx_x, rx_y = points_to_local_coords(ref_lat, ref_lon,
                                        np.array([rx_lat]), np.array([rx_lon]))
    tx = np.array([tx_x[0], tx_y[0]])
    rx = np.array([rx_x[0], rx_y[0]])

    # Convert buildings to arrays
    (walls_start, walls_end, wall_to_bldg,
     poly_x, poly_y, poly_lengths) = _buildings_to_arrays(buildings, ref_lat, ref_lon)

    direct_dist = np.linalg.norm(rx - tx)

    # Step 1: Check direct LOS
    if len(poly_lengths) == 0:
        # No buildings, direct LOS
        return {
            'has_los': True,
            'path_type': 'direct',
            'total_loss': 0.0,
            'reflection_point': None,
            'path_length': direct_dist
        }

    intersects, _ = line_intersects_polygons_batch(tx, rx, poly_x, poly_y, poly_lengths)

    if not np.any(intersects):
        # Direct LOS exists
        return {
            'has_los': True,
            'path_type': 'direct',
            'total_loss': 0.0,
            'reflection_point': None,
            'path_length': direct_dist
        }

    # Step 2: Find best reflection path
    refl_point, refl_length, refl_loss = find_best_reflection_path_vectorized(
        tx, rx,
        walls_start, walls_end, wall_to_bldg,
        poly_x, poly_y, poly_lengths,
        max_candidates=50
    )

    if refl_point is not None:
        # Convert reflection point back to lat/lon
        refl_lat = ref_lat + refl_point[1] / 110540
        refl_lon = ref_lon + refl_point[0] / (111320 * np.cos(np.radians(ref_lat)))

        return {
            'has_los': False,
            'path_type': 'reflection',
            'total_loss': refl_loss,
            'reflection_point': [refl_lat, refl_lon],
            'path_length': refl_length
        }

    # Step 3: Fallback to diffraction (simplified)
    # Count blocking buildings for diffraction loss estimate
    num_blocking = np.sum(intersects)
    diffraction_loss = 10 + 5 * min(num_blocking, 5)  # 10-35 dB

    return {
        'has_los': False,
        'path_type': 'diffraction',
        'total_loss': diffraction_loss,
        'reflection_point': None,
        'path_length': direct_dist  # Approximate
    }
```

---

### MODIFY: `backend/app/services/coverage_service.py`

Update `_calculate_point_sync()` to use the vectorized dominant path:

```python
# In _calculate_point_sync(), replace dominant_path call:
if use_dominant_path and buildings:
    from .dominant_path_service import find_dominant_path_vectorized

    dominant = find_dominant_path_vectorized(
        tx_lat=site['lat'], tx_lon=site['lon'],
        rx_lat=point_lat, rx_lon=point_lon,
        buildings=buildings,
        frequency_mhz=site.get('frequency', 1800)
    )

    if dominant['path_type'] == 'reflection':
        reflection_gain = max(0, 10 - dominant['total_loss'])  # Convert loss to gain
        building_loss = 0  # Reflection path avoids buildings
    elif dominant['path_type'] == 'diffraction':
        building_loss = dominant['total_loss']
        reflection_gain = 0
    else:  # Direct LOS
        building_loss = 0
        reflection_gain = 0
```

---

## 🧪 Testing

```bash
# Run test script
.\test-coverage.bat

# Expected results:
# Fast:     ~0.03s (unchanged)
# Standard: ~35-40s (unchanged)
# Detailed: ~30-60s (was 300s timeout!)
#           ← 5-10x faster
```

---

## ✅ Success Criteria

| Metric | Before | After |
|--------|--------|-------|
| Detailed time | 300s (timeout) | <90s |
| Detailed ms/point | 346ms | <50ms |
| Memory peak | ~7GB | ~3-4GB |
| Accuracy | Baseline | Similar ±2dB |

---

## 📝 Notes

1. **LOS checks still have a small loop** — but limited to top 50 candidates sorted by path length
2. **Building→array conversion** happens once per calculation, not per point
3. **Local coordinate system** avoids expensive lat/lon math in inner loops
4. **Reflection model simplified** — uses path ratio for loss estimate

---

## 🔜 Future Optimizations

If still slow after vectorization:

- Cache building arrays (don't reconvert every point)
- Use scipy.spatial.cKDTree for spatial queries
- GPU acceleration with CuPy (drop-in NumPy replacement)
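
---

## 🔬 Sanity Check

Before trusting the speedup numbers, it is worth confirming that the vectorized math matches the loop it replaces. Below is a minimal standalone sketch (not part of the service code) that mirrors `haversine_batch` and checks it against a pure-Python per-point loop; the test coordinates and point count are arbitrary illustration values:

```python
import math
import numpy as np

EARTH_RADIUS = 6371000  # meters, same constant as geometry_vectorized.py


def haversine_scalar(lat1, lon1, lat2, lon2):
    """Reference single-pair haversine in pure Python (the 'old' loop body)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return EARTH_RADIUS * 2 * math.asin(math.sqrt(a))


def haversine_batch(lat1, lon1, lats2, lons2):
    """Vectorized version: one origin to N destinations in a single call."""
    lat1_rad, lon1_rad = np.radians(lat1), np.radians(lon1)
    lats2_rad, lons2_rad = np.radians(lats2), np.radians(lons2)
    a = (np.sin((lats2_rad - lat1_rad) / 2) ** 2
         + np.cos(lat1_rad) * np.cos(lats2_rad) * np.sin((lons2_rad - lon1_rad) / 2) ** 2)
    return EARTH_RADIUS * 2 * np.arcsin(np.sqrt(a))


# 1000 random points within ~0.05 degrees of an arbitrary origin
rng = np.random.default_rng(42)
lats = 52.0 + rng.uniform(-0.05, 0.05, 1000)
lons = 13.0 + rng.uniform(-0.05, 0.05, 1000)

batch = haversine_batch(52.0, 13.0, lats, lons)
loop = np.array([haversine_scalar(52.0, 13.0, la, lo)
                 for la, lo in zip(lats, lons)])

# One NumPy call replaces 1000 Python-level calls, with identical results
assert np.allclose(batch, loop)
```

The same pattern (loop reference vs. batch result, `np.allclose`) extends naturally to the intersection and reflection functions, and would back the "Accuracy: Similar ±2dB" success criterion with an automated check.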