Installation

This guide covers installing L-SURF and setting up GPU acceleration for high-performance ray tracing.

Prerequisites

Check Your System

Before installing, verify your GPU setup:

# Check NVIDIA driver installation and GPU info
nvidia-smi

# Check CUDA toolkit version (if installed)
nvcc --version

Example output from nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05   Driver Version: 535.154.05   CUDA Version: 12.2    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   45C    P8    10W / 250W |    512MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
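The same prerequisite checks can be scripted with the Python standard library, e.g. for setup tooling (a sketch):

```python
import shutil

# Locate the NVIDIA tools on PATH; None means missing or not on PATH
for tool in ("nvidia-smi", "nvcc"):
    path = shutil.which(tool)
    print(f"{tool}: {path or 'not found'}")
```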

Python Requirements

  • Python >= 3.13

  • conda/mamba (recommended) or pip
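The version requirement can be checked from the interpreter itself (a small sketch):

```python
import sys

MIN_VERSION = (3, 13)  # minimum Python for L-SURF, per the requirement above
ok = sys.version_info >= MIN_VERSION
print(f"Python {sys.version.split()[0]}: {'OK' if ok else 'too old, need >= 3.13'}")
```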

Dependencies

Category      Packages                                            Notes
Core          numpy >= 1.24, matplotlib >= 3.7, pydantic >= 2.0   Required for all functionality
GPU           numba >= 0.58, CUDA Toolkit >= 11.0                 Required for GPU acceleration
Optional      h5py >= 3.8, astropy-healpix >= 1.0                 HDF5 support, spherical analysis
Development   pytest, black, ruff, mypy, pre-commit               For development and testing

Installation Methods

Recommended: conda

Clone the repository and create the environment from the provided environment.yml:

git clone https://github.com/your-org/lsurf.git
cd lsurf
conda env create -f environment.yml
conda activate lsurf

Alternative: pip

If you prefer pip without conda:

git clone https://github.com/your-org/lsurf.git
cd lsurf
pip install -e ".[dev]"

GPU Setup

L-SURF uses Numba for GPU acceleration via CUDA. There are two ways to set up CUDA:

Option 1: System CUDA

Use a system-wide CUDA toolkit installation (the one nvcc --version reports under Prerequisites). If Numba cannot locate it, point the CUDA_HOME environment variable at the install prefix (e.g. /usr/local/cuda).

Option 2: Conda CUDA

Install CUDA toolkit via conda-forge (useful for isolated environments):

conda activate lsurf
conda install -c conda-forge cudatoolkit=12.0

Warning

Conda CUDA requires matching versions between cudatoolkit and your NVIDIA driver. Check compatibility at CUDA Toolkit Release Notes.

Verifying GPU Setup

After installation, verify GPU acceleration works:

from numba import cuda
import numpy as np

# Check CUDA availability
print(f"CUDA available: {cuda.is_available()}")

if cuda.is_available():
    gpu = cuda.get_current_device()
    print(f"GPU: {gpu.name}")
    print(f"Compute Capability: {gpu.compute_capability}")
    print(f"Total Memory: {gpu.total_memory / 1e9:.1f} GB")

    # Compile and launch a trivial kernel that doubles each element
    @cuda.jit
    def test_kernel(arr):
        i = cuda.grid(1)
        if i < arr.size:
            arr[i] *= 2

    test_arr = cuda.to_device(np.ones(1000, dtype=np.float32))
    test_kernel[10, 100](test_arr)  # 10 blocks x 100 threads covers all 1000 elements
    assert (test_arr.copy_to_host() == 2.0).all()  # copy_to_host also synchronizes
    print("GPU kernel test: PASSED")
else:
    print("GPU not available - will use CPU fallback")

Run a Quick Benchmark

Test GPU performance with a simple ray tracing simulation:

import lsurf as sr
import time

# Create a simple surface and source
surface = sr.create_planar_surface(
    point=(0, 0, 0),
    normal=(0, 0, 1),
)

source = sr.CollimatedBeam(
    center=(0, 0, 1),
    direction=(0, 0, -1),
    radius=0.1,
    num_rays=100000,
    wavelength=532e-9,
)

rays = source.generate()

# Time the intersection calculation
start = time.perf_counter()
distances, hit_mask = surface.intersect(rays.positions, rays.directions)
elapsed = time.perf_counter() - start

print(f"Intersected {rays.num_rays:,} rays in {elapsed*1000:.1f} ms")
print(f"Throughput: {rays.num_rays/elapsed/1e6:.1f} million rays/second")

Typical performance:

  • GPU (RTX 3080): ~500 million rays/second

  • CPU (8-core): ~5 million rays/second
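For intuition about what intersect computes for a planar surface, here is a vectorized ray-plane test in plain NumPy. This is a CPU sketch, not L-SURF's implementation, and conventions such as the (N, 3) array layout are assumptions:

```python
import numpy as np

def intersect_plane(positions, directions, point, normal, eps=1e-12):
    """Distance along each ray to the plane through `point` with `normal`.
    hit_mask is False for rays parallel to the plane or hitting it behind
    the ray origin."""
    normal = np.asarray(normal, dtype=np.float64)
    denom = directions @ normal                    # ray direction . normal, shape (N,)
    numer = (np.asarray(point, dtype=np.float64) - positions) @ normal
    hit_mask = np.abs(denom) > eps
    safe = np.where(hit_mask, denom, 1.0)          # avoid divide-by-zero
    distances = np.where(hit_mask, numer / safe, np.inf)
    hit_mask &= distances > 0
    return distances, hit_mask

# Rays at z = 1 aimed straight down at the plane z = 0, as in the benchmark
pos = np.tile([0.0, 0.0, 1.0], (4, 1))
dirs = np.tile([0.0, 0.0, -1.0], (4, 1))
d, m = intersect_plane(pos, dirs, point=(0, 0, 0), normal=(0, 0, 1))
```

Each ray travels a distance of 1.0 to the plane, and all four hit. The GPU version parallelizes exactly this per-ray arithmetic across CUDA threads.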

Troubleshooting

CUDA Not Available

If cuda.is_available() returns False:

  1. Check NVIDIA driver:

    nvidia-smi
    

    If this fails, install/reinstall NVIDIA drivers.

  2. Check CUDA toolkit:

    nvcc --version
    

    If this fails, install CUDA toolkit or set CUDA_HOME.

  3. Verify GPU compute capability:

    Your GPU must have Compute Capability >= 3.5. Check at CUDA GPUs.

  4. Reinstall numba:

    pip install --upgrade --force-reinstall numba
    

Conda Package Not Found

If you see PackagesNotFoundError:

# Update conda
conda update -n base conda

# Clear cache and retry
conda clean --all
conda env create -f environment.yml

Out of GPU Memory

If you see CudaAPIError: Out of memory:

  1. Reduce num_rays in your simulation

  2. Use batched processing for large simulations

  3. Close other GPU applications

  4. Check memory usage with nvidia-smi

# Process rays in batches (positions/directions/intersect as in the benchmark above)
batch_size = 100_000
for i in range(0, total_rays, batch_size):
    batch_pos = rays.positions[i:i + batch_size]
    batch_dir = rays.directions[i:i + batch_size]
    distances, hit_mask = surface.intersect(batch_pos, batch_dir)
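To choose a batch size up front, you can estimate per-ray memory from the array sizes. This is a rough sketch; the 3 x float32 position, 3 x float32 direction, float32 distance, and 1-byte mask per ray are assumptions about the data layout:

```python
def max_rays_for(budget_bytes, bytes_per_ray=3 * 4 + 3 * 4 + 4 + 1):
    """Rays that fit in budget_bytes, counting position, direction,
    distance, and hit-mask storage (29 bytes/ray by default)."""
    return budget_bytes // bytes_per_ray

# e.g. budget 1 GB of an 11 GB card for ray data
print(max_rays_for(1_000_000_000))  # -> 34482758
```

In practice, leave generous headroom: intermediate buffers and other processes (check nvidia-smi) also consume GPU memory.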

Slow Performance

If simulations are slower than expected:

  1. Verify GPU is being used:

    from numba import cuda
    print(cuda.is_available())  # Should be True
    
  2. Monitor GPU utilization during simulation:

    watch -n 0.5 nvidia-smi
    
  3. Increase ray count - GPUs perform better with more parallel work:

    # Too few rays - GPU underutilized
    source = sr.CollimatedBeam(num_rays=1000)  # Bad
    
    # Better GPU utilization
    source = sr.CollimatedBeam(num_rays=100000)  # Good
    
  4. Check for Numba warnings about suboptimal grid sizes.
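On point 4: Numba emits a low-occupancy warning when a launch covers too few threads. A 1-D launch is usually sized so that every element gets a thread (a sketch of the standard pattern, not L-SURF internals):

```python
def launch_config(n, threads_per_block=256):
    """Blocks and threads for a 1-D kernel covering n elements."""
    blocks = (n + threads_per_block - 1) // threads_per_block  # ceil division
    return blocks, threads_per_block

blocks, tpb = launch_config(100_000)
# a launch like kernel[blocks, tpb](arr) then covers all 100,000 elements
print(blocks, tpb)  # -> 391 256
```

Each kernel thread still needs the `if i < arr.size` guard shown in the verification example, since blocks * threads_per_block can slightly exceed n.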

Import Errors

If you get ModuleNotFoundError: No module named 'lsurf':

# Verify installation
pip show lsurf

# Reinstall if needed
pip install -e ".[dev]"

Platform-Specific Notes

Windows

  • Numba's CUDA target works on native Windows with the standard NVIDIA driver and CUDA toolkit

  • For a Linux-like environment, see the WSL2 notes below

macOS

  • CUDA is not supported on macOS (Apple Silicon or Intel)

  • L-SURF will use CPU-only mode automatically

  • Performance will be limited compared to NVIDIA GPU systems

WSL2 (Windows Subsystem for Linux)

GPU passthrough works with WSL2:

  1. Install latest NVIDIA Windows driver (supports WSL2 GPU)

  2. Do not install CUDA inside WSL2 - it uses Windows driver

  3. Install L-SURF normally inside WSL2

# Inside WSL2
nvidia-smi  # Should show your Windows GPU
conda env create -f environment.yml
conda activate lsurf