Installation

This guide covers installing L-SURF and setting up GPU acceleration for high-performance ray tracing.

Prerequisites

Check Your System

Before installing, verify your GPU setup:

# Check NVIDIA driver installation and GPU info
nvidia-smi

# Check CUDA toolkit version (if installed)
nvcc --version

Example output from nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05   Driver Version: 535.154.05   CUDA Version: 12.2    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   45C    P8    10W / 250W |    512MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
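The same prerequisite checks can be scripted with the Python standard library, e.g. for setup tooling (a sketch):

```python
import shutil

# Locate the NVIDIA tools on PATH; None means missing or not on PATH
for tool in ("nvidia-smi", "nvcc"):
    path = shutil.which(tool)
    print(f"{tool}: {path or 'not found'}")
```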

Python Requirements

  • Python >= 3.13

  • conda/mamba (recommended) or pip
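The version requirement can be checked from the interpreter itself (a small sketch):

```python
import sys

MIN_VERSION = (3, 13)  # minimum Python for L-SURF, per the requirement above
ok = sys.version_info >= MIN_VERSION
print(f"Python {sys.version.split()[0]}: {'OK' if ok else 'too old, need >= 3.13'}")
```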

Dependencies

Category      Packages                                            Notes
Core          numpy >= 1.24, matplotlib >= 3.7, pydantic >= 2.0   Required for all functionality
GPU           numba >= 0.58, CUDA Toolkit >= 11.0                 Required for GPU acceleration
Optional      h5py >= 3.8, astropy-healpix >= 1.0                 HDF5 support, spherical analysis
Development   pytest, black, ruff, mypy, pre-commit               For development and testing

Installation Methods

Recommended: conda

Clone the repository and create the environment from the provided environment.yml:

git clone https://github.com/your-org/lsurf.git
cd lsurf
conda env create -f environment.yml
conda activate lsurf

Alternative: pip

If you prefer pip without conda:

git clone https://github.com/your-org/lsurf.git
cd lsurf
pip install -e ".[dev]"

GPU Setup

L-SURF uses Numba for GPU acceleration via CUDA. There are two ways to set up CUDA:

Option 1: System CUDA

Use a system-wide CUDA toolkit installation (the one nvcc --version reports under Prerequisites). If Numba cannot locate it, point the CUDA_HOME environment variable at the install prefix (e.g. /usr/local/cuda).

Option 2: Conda CUDA

Install CUDA toolkit via conda-forge (useful for isolated environments):

conda activate lsurf
conda install -c conda-forge cudatoolkit=12.0

Warning

Conda CUDA requires matching versions between cudatoolkit and your NVIDIA driver. Check compatibility at CUDA Toolkit Release Notes.

Verifying GPU Setup

After installation, verify GPU acceleration works:

from numba import cuda
import numpy as np

# Check CUDA availability
print(f"CUDA available: {cuda.is_available()}")

if cuda.is_available():
    gpu = cuda.get_current_device()
    print(f"GPU: {gpu.name}")
    print(f"Compute Capability: {gpu.compute_capability}")
    print(f"Total Memory: {gpu.total_memory / 1e9:.1f} GB")

    # Compile and launch a trivial kernel that doubles each element
    @cuda.jit
    def test_kernel(arr):
        i = cuda.grid(1)
        if i < arr.size:
            arr[i] *= 2

    test_arr = cuda.to_device(np.ones(1000, dtype=np.float32))
    test_kernel[10, 100](test_arr)  # 10 blocks x 100 threads covers all 1000 elements
    assert (test_arr.copy_to_host() == 2.0).all()  # copy_to_host also synchronizes
    print("GPU kernel test: PASSED")
else:
    print("GPU not available - will use CPU fallback")

Run a Quick Benchmark

Test GPU performance with a simple ray tracing simulation:

import lsurf as sr
import time

# Create a simple surface and source
surface = sr.create_planar_surface(
    point=(0, 0, 0),
    normal=(0, 0, 1),
)

source = sr.CollimatedBeam(
    center=(0, 0, 1),
    direction=(0, 0, -1),
    radius=0.1,
    num_rays=100000,
    wavelength=532e-9,
)

rays = source.generate()

# Time the intersection calculation
start = time.perf_counter()
distances, hit_mask = surface.intersect(rays.positions, rays.directions)
elapsed = time.perf_counter() - start

print(f"Intersected {rays.num_rays:,} rays in {elapsed*1000:.1f} ms")
print(f"Throughput: {rays.num_rays/elapsed/1e6:.1f} million rays/second")

Typical performance:

  • GPU (RTX 3080): ~500 million rays/second

  • CPU (8-core): ~5 million rays/second
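For intuition about what intersect computes for a planar surface, here is a vectorized ray-plane test in plain NumPy. This is a CPU sketch, not L-SURF's implementation, and conventions such as the (N, 3) array layout are assumptions:

```python
import numpy as np

def intersect_plane(positions, directions, point, normal, eps=1e-12):
    """Distance along each ray to the plane through `point` with `normal`.
    hit_mask is False for rays parallel to the plane or hitting it behind
    the ray origin."""
    normal = np.asarray(normal, dtype=np.float64)
    denom = directions @ normal                    # ray direction . normal, shape (N,)
    numer = (np.asarray(point, dtype=np.float64) - positions) @ normal
    hit_mask = np.abs(denom) > eps
    safe = np.where(hit_mask, denom, 1.0)          # avoid divide-by-zero
    distances = np.where(hit_mask, numer / safe, np.inf)
    hit_mask &= distances > 0
    return distances, hit_mask

# Rays at z = 1 aimed straight down at the plane z = 0, as in the benchmark
pos = np.tile([0.0, 0.0, 1.0], (4, 1))
dirs = np.tile([0.0, 0.0, -1.0], (4, 1))
d, m = intersect_plane(pos, dirs, point=(0, 0, 0), normal=(0, 0, 1))
```

Each ray travels a distance of 1.0 to the plane, and all four hit. The GPU version parallelizes exactly this per-ray arithmetic across CUDA threads.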

Troubleshooting

CUDA Not Available

If cuda.is_available() returns False:

  1. Check NVIDIA driver:

    nvidia-smi
    

    If this fails, install/reinstall NVIDIA drivers.

  2. Check CUDA toolkit:

    nvcc --version
    

    If this fails, install CUDA toolkit or set CUDA_HOME.

  3. Verify GPU compute capability:

    Your GPU must have Compute Capability >= 3.5. Check at CUDA GPUs.

  4. Reinstall numba:

    pip install --upgrade --force-reinstall numba
    

Conda Package Not Found

If you see PackagesNotFoundError:

# Update conda
conda update -n base conda

# Clear cache and retry
conda clean --all
conda env create -f environment.yml

Out of GPU Memory

If you see CudaAPIError: Out of memory:

  1. Reduce num_rays in your simulation

  2. Use batched processing for large simulations

  3. Close other GPU applications

  4. Check memory usage with nvidia-smi

# Process rays in batches (positions/directions/intersect as in the benchmark above)
batch_size = 100_000
for i in range(0, total_rays, batch_size):
    batch_pos = rays.positions[i:i + batch_size]
    batch_dir = rays.directions[i:i + batch_size]
    distances, hit_mask = surface.intersect(batch_pos, batch_dir)
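To choose a batch size up front, you can estimate per-ray memory from the array sizes. This is a rough sketch; the 3 x float32 position, 3 x float32 direction, float32 distance, and 1-byte mask per ray are assumptions about the data layout:

```python
def max_rays_for(budget_bytes, bytes_per_ray=3 * 4 + 3 * 4 + 4 + 1):
    """Rays that fit in budget_bytes, counting position, direction,
    distance, and hit-mask storage (29 bytes/ray by default)."""
    return budget_bytes // bytes_per_ray

# e.g. budget 1 GB of an 11 GB card for ray data
print(max_rays_for(1_000_000_000))  # -> 34482758
```

In practice, leave generous headroom: intermediate buffers and other processes (check nvidia-smi) also consume GPU memory.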

Slow Performance

If simulations are slower than expected:

  1. Verify GPU is being used:

    from numba import cuda
    print(cuda.is_available())  # Should be True
    
  2. Monitor GPU utilization during simulation:

    watch -n 0.5 nvidia-smi
    
  3. Increase ray count - GPUs perform better with more parallel work:

    # Too few rays - GPU underutilized
    source = sr.CollimatedBeam(num_rays=1000)  # Bad
    
    # Better GPU utilization
    source = sr.CollimatedBeam(num_rays=100000)  # Good
    
  4. Check for Numba warnings about suboptimal grid sizes.
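On point 4: Numba emits a low-occupancy warning when a launch covers too few threads. A 1-D launch is usually sized so that every element gets a thread (a sketch of the standard pattern, not L-SURF internals):

```python
def launch_config(n, threads_per_block=256):
    """Blocks and threads for a 1-D kernel covering n elements."""
    blocks = (n + threads_per_block - 1) // threads_per_block  # ceil division
    return blocks, threads_per_block

blocks, tpb = launch_config(100_000)
# a launch like kernel[blocks, tpb](arr) then covers all 100,000 elements
print(blocks, tpb)  # -> 391 256
```

Each kernel thread still needs the `if i < arr.size` guard shown in the verification example, since blocks * threads_per_block can slightly exceed n.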

Import Errors

If you get ModuleNotFoundError: No module named 'lsurf':

# Verify installation
pip show lsurf

# Reinstall if needed
pip install -e ".[dev]"

Platform-Specific Notes

Windows

  • Numba's CUDA target works on native Windows with the standard NVIDIA driver and CUDA toolkit

  • For a Linux-like environment, see the WSL2 notes below

macOS

  • CUDA is not supported on macOS (Apple Silicon or Intel)

  • L-SURF will use CPU-only mode automatically

  • Performance will be limited compared to NVIDIA GPU systems

WSL2 (Windows Subsystem for Linux)

GPU passthrough works with WSL2:

  1. Install latest NVIDIA Windows driver (supports WSL2 GPU)

  2. Do not install CUDA inside WSL2 - it uses Windows driver

  3. Install L-SURF normally inside WSL2

# Inside WSL2
nvidia-smi  # Should show your Windows GPU
conda env create -f environment.yml
conda activate lsurf