Installation
This guide covers installing L-SURF and setting up GPU acceleration for high-performance ray tracing.
Prerequisites
NVIDIA GPU (Recommended)
For GPU-accelerated simulations, you need:
NVIDIA GPU with Compute Capability 3.5+ (most GPUs from 2012 onwards)
NVIDIA Driver version 450 or later
CUDA Toolkit version 11.0 or later
Note
L-SURF works without a GPU using CPU fallback, but simulations will be 10-100x slower.
Check Your System
Before installing, verify your GPU setup:
# Check NVIDIA driver installation and GPU info
nvidia-smi
# Check CUDA toolkit version (if installed)
nvcc --version
Example output from nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05 Driver Version: 535.154.05 CUDA Version: 12.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 0% 45C P8 10W / 250W | 512MiB / 11264MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
Python Requirements
Python >= 3.13
conda/mamba (recommended) or pip
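The Python minimum can be checked programmatically before installing; a minimal sketch, where the helper name `meets_minimum` is illustrative and not part of L-SURF:

```python
import sys

def meets_minimum(version=None, minimum=(3, 13)):
    """Return True if (major, minor) of `version` meets `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum

print(f"Python {sys.version.split()[0]}: "
      f"{'OK' if meets_minimum() else 'too old, need >= 3.13'}")
```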
Dependencies
| Category | Packages | Notes |
|---|---|---|
| Core | numpy >= 1.24, matplotlib >= 3.7, pydantic >= 2.0 | Required for all functionality |
| GPU | numba >= 0.58, CUDA Toolkit >= 11.0 | Required for GPU acceleration |
| Optional | h5py >= 3.8, astropy-healpix >= 1.0 | HDF5 support, spherical analysis |
| Development | pytest, black, ruff, mypy, pre-commit | For development and testing |
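The core minimums from the table can be verified against what is actually installed using the standard library's `importlib.metadata`; this is a sketch (the `parse_version` helper is illustrative, and only keeps the first two numeric components):

```python
from importlib import metadata

# Minimum (major, minor) versions for the core dependencies listed above
REQUIREMENTS = {"numpy": (1, 24), "matplotlib": (3, 7), "pydantic": (2, 0)}

def parse_version(text):
    """Extract the leading (major, minor) from a version string, e.g. '1.26.4' -> (1, 26)."""
    parts = []
    for token in text.split(".")[:2]:
        digits = "".join(ch for ch in token if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

for name, minimum in REQUIREMENTS.items():
    try:
        installed = parse_version(metadata.version(name))
    except metadata.PackageNotFoundError:
        print(f"{name}: not installed")
        continue
    status = "OK" if installed >= minimum else "too old"
    print(f"{name}: {installed} ({status}, needs >= {minimum})")
```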
Installation Methods
Recommended: Conda Environment
This method handles all dependencies automatically:
# 1. Clone the repository
git clone https://github.com/your-org/lsurf.git
cd lsurf
# 2. Create the conda environment
conda env create -f environment.yml
# 3. Activate the environment
conda activate lsurf
# 4. Verify installation
python -c "import lsurf; print('L-SURF installed successfully')"
Alternative: pip
If you prefer pip without conda:
git clone https://github.com/your-org/lsurf.git
cd lsurf
pip install -e ".[dev]"
GPU Setup
L-SURF uses Numba for GPU acceleration via CUDA. There are two approaches to set up CUDA:
Option 1: System CUDA (Recommended)
Install NVIDIA drivers and CUDA toolkit at the system level. This is the most reliable method.
Ubuntu/Debian:
# Install NVIDIA driver
sudo apt update
sudo apt install nvidia-driver-535 # Use latest available version
# Install CUDA toolkit
# Download from: https://developer.nvidia.com/cuda-downloads
# Or use the package manager:
sudo apt install nvidia-cuda-toolkit
Fedora:
# Install NVIDIA driver (RPM Fusion required)
sudo dnf install akmod-nvidia
# Install CUDA toolkit
sudo dnf install cuda
Arch Linux:
sudo pacman -S nvidia cuda
After installation, add CUDA to your PATH (add to ~/.bashrc):
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
Option 2: Conda CUDA
Install CUDA toolkit via conda-forge (useful for isolated environments):
conda activate lsurf
conda install -c conda-forge cudatoolkit=12.0
Warning
Conda CUDA requires matching versions between cudatoolkit and your NVIDIA driver. Check compatibility at CUDA Toolkit Release Notes.
Verifying GPU Setup
After installation, verify GPU acceleration works:
from numba import cuda
import numpy as np

# Check CUDA availability
print(f"CUDA available: {cuda.is_available()}")

if cuda.is_available():
    gpu = cuda.get_current_device()
    print(f"GPU: {gpu.name}")
    print(f"Compute Capability: {gpu.compute_capability}")
    print(f"Total Memory: {gpu.total_memory / 1e9:.1f} GB")

    # Test a simple kernel
    @cuda.jit
    def test_kernel(arr):
        i = cuda.grid(1)
        if i < arr.size:
            arr[i] *= 2

    test_arr = cuda.to_device(np.ones(1000, dtype=np.float32))
    test_kernel[10, 100](test_arr)
    print("GPU kernel test: PASSED")
else:
    print("GPU not available - will use CPU fallback")
Run a Quick Benchmark
Test GPU performance with a simple ray tracing simulation:
import lsurf as sr
import time

# Create a simple surface and source
surface = sr.create_planar_surface(
    point=(0, 0, 0),
    normal=(0, 0, 1),
)

source = sr.CollimatedBeam(
    center=(0, 0, 1),
    direction=(0, 0, -1),
    radius=0.1,
    num_rays=100000,
    wavelength=532e-9,
)

rays = source.generate()

# Time the intersection calculation
start = time.perf_counter()
distances, hit_mask = surface.intersect(rays.positions, rays.directions)
elapsed = time.perf_counter() - start

print(f"Intersected {rays.num_rays:,} rays in {elapsed*1000:.1f} ms")
print(f"Throughput: {rays.num_rays/elapsed/1e6:.1f} million rays/second")
Typical performance:
GPU (RTX 3080): ~500 million rays/second
CPU (8-core): ~5 million rays/second
Troubleshooting
CUDA Not Available
If cuda.is_available() returns False:
Check NVIDIA driver:
nvidia-smi
If this fails, install/reinstall NVIDIA drivers.
Check CUDA toolkit:
nvcc --version
If this fails, install the CUDA toolkit or set CUDA_HOME.
Verify GPU compute capability:
Your GPU must have Compute Capability >= 3.5. Check at CUDA GPUs.
Reinstall numba:
pip install --upgrade --force-reinstall numba
Conda Package Not Found
If you see PackagesNotFoundError:
# Update conda
conda update -n base conda
# Clear cache and retry
conda clean --all
conda env create -f environment.yml
Out of GPU Memory
If you see CudaAPIError: Out of memory:
Reduce num_rays in your simulation
Use batched processing for large simulations
Close other GPU applications
Check memory usage with nvidia-smi
# Process rays in batches
batch_size = 100000
for i in range(0, total_rays, batch_size):
    batch = rays[i:i+batch_size]
    # Process batch...
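The batching pattern above can be made concrete with plain NumPy, independently of the GPU. This sketch assumes any per-batch callable (a stand-in for the real intersection call); the helper name `process_in_batches` is illustrative:

```python
import numpy as np

def process_in_batches(data, process, batch_size=100_000):
    """Apply `process` to fixed-size slices of `data` and concatenate the results."""
    results = []
    for start in range(0, len(data), batch_size):
        results.append(process(data[start:start + batch_size]))
    return np.concatenate(results)

# Stand-in for the real per-batch work (e.g. surface.intersect)
doubled = process_in_batches(np.arange(10, dtype=np.float32),
                             lambda batch: batch * 2,
                             batch_size=3)
```

Only one batch of rays is resident on the GPU at a time, so peak memory is bounded by `batch_size` rather than the total ray count.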
Slow Performance
If simulations are slower than expected:
Verify GPU is being used:
from numba import cuda
print(cuda.is_available())  # Should be True
Monitor GPU utilization during simulation:
watch -n 0.5 nvidia-smi
Increase ray count - GPUs perform better with more parallel work:
# Too few rays - GPU underutilized
source = sr.CollimatedBeam(num_rays=1000)  # Bad

# Better GPU utilization
source = sr.CollimatedBeam(num_rays=100000)  # Good
Check for Numba warnings about suboptimal grid sizes.
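The grid size is typically chosen by ceiling division so every element gets a thread; a small sketch (the helper name is illustrative, and 256 threads per block is a common default rather than an L-SURF requirement):

```python
def launch_config(n, threads_per_block=256):
    """Return (blocks, threads_per_block) so blocks * threads_per_block >= n."""
    blocks = (n + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block

# e.g. launching a kernel over 100,000 rays: kernel[blocks, threads](arr)
blocks, threads = launch_config(100_000)
```

Kernels written with a bounds check like `if i < arr.size` (as in the verification snippet above) tolerate the extra threads in the final block.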
Import Errors
If you get ModuleNotFoundError: No module named 'lsurf':
# Verify installation
pip show lsurf
# Reinstall if needed
pip install -e ".[dev]"
Platform-Specific Notes
Windows
Install NVIDIA drivers from NVIDIA Driver Downloads
Install CUDA Toolkit from CUDA Downloads
Use Anaconda/Miniconda for Python environment management
macOS
CUDA is not supported on macOS (Apple Silicon or Intel)
L-SURF will use CPU-only mode automatically
Performance will be limited compared to NVIDIA GPU systems
WSL2 (Windows Subsystem for Linux)
GPU passthrough works with WSL2:
Install latest NVIDIA Windows driver (supports WSL2 GPU)
Do not install a separate Linux NVIDIA driver inside WSL2 - WSL2 uses the Windows driver
Install L-SURF normally inside WSL2
# Inside WSL2
nvidia-smi # Should show your Windows GPU
conda env create -f environment.yml
conda activate lsurf