Sampling backends

Compile-time GPU / accelerator backend selection for the fractional family via `.on::<B>()` — the `Backend` trait, its markers, and zero-cost dispatch.

The fractional / fGN process family selects where its FFT-based sampling runs (CPU, CUDA, Metal, or Apple Accelerate) at compile time. A process is parameterised by a backend marker `B` (defaulting to `Cpu`); the `Backend` trait monomorphises `sample` / `sample_par` to that backend with no runtime branch.

One uniform selector now stands in for the ad-hoc per-backend methods (`sample_cuda_native`, `sample_gpu`, …). Those low-level methods still exist, but `.on::<B>()` is the recommended API.

Switching backend

Re-type the process with the `.on::<B>()` turbofish, then sample as usual:

```rust
use stochastic_rs::stochastic::device::CudaNative;
use stochastic_rs::stochastic::noise::fgn::Fgn;
use stochastic_rs::simd_rng::Unseeded;

// No `.on::<B>()` call: the process keeps the default `Cpu` backend.
let cpu = Fgn::<f32, _>::new(0.7, 65_536, None, Unseeded).sample();

// Same constructor and same `sample()` call, re-typed to run on the GPU.
let gpu = Fgn::<f32, _>::new(0.7, 65_536, None, Unseeded)
    .on::<CudaNative>()
    .sample();
```

`.on::<B>()` consumes the process and returns it re-typed to backend `B`: the fields are moved, nothing is copied, and the choice is resolved at compile time.
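The re-typing pattern described above can be sketched without the crate. This is a minimal illustration under stated assumptions, not the library's implementation: `Noise`, `FakeGpu`, `fill`, and `backend_name` are hypothetical names invented for the sketch, and the local `Backend` and `Cpu` are stand-ins whose real definitions in stochastic-rs may look different.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the crate's `Backend` trait.
pub trait Backend {
    const NAME: &'static str;
    /// Fill `out` with samples. A real backend would run its FFT here.
    fn fill(out: &mut [f32]);
}

/// Stand-in markers: zero-sized types that exist only at the type level.
pub struct Cpu;
pub struct FakeGpu; // illustrative only, not a stochastic-rs type

impl Backend for Cpu {
    const NAME: &'static str = "cpu";
    fn fill(out: &mut [f32]) {
        out.iter_mut().for_each(|x| *x = 0.0);
    }
}

impl Backend for FakeGpu {
    const NAME: &'static str = "fake-gpu";
    fn fill(out: &mut [f32]) {
        out.iter_mut().for_each(|x| *x = 1.0);
    }
}

/// Toy process that carries its backend as a defaulted type parameter.
pub struct Noise<B: Backend = Cpu> {
    hurst: f32,
    n: usize,
    _backend: PhantomData<B>,
}

impl Noise<Cpu> {
    /// Constructors live on the `Cpu` instantiation, so `Noise::new` infers `Cpu`.
    pub fn new(hurst: f32, n: usize) -> Self {
        Noise { hurst, n, _backend: PhantomData }
    }
}

impl<B: Backend> Noise<B> {
    /// Consume `self` and move its fields into the same struct typed for `B2`.
    pub fn on<B2: Backend>(self) -> Noise<B2> {
        Noise { hurst: self.hurst, n: self.n, _backend: PhantomData }
    }

    /// Statically dispatched: each `B` gets its own monomorphised copy.
    pub fn sample(&self) -> Vec<f32> {
        let mut out = vec![0.0; self.n];
        B::fill(&mut out);
        out
    }

    pub fn backend_name(&self) -> &'static str {
        B::NAME
    }

    pub fn hurst(&self) -> f32 {
        self.hurst
    }
}
```

With these stand-ins, `Noise::new(0.7, 4).on::<FakeGpu>().sample()` calls `FakeGpu::fill` directly. Because the marker is held in a `PhantomData`, `Noise<Cpu>` and `Noise<FakeGpu>` have the same size and the backend choice adds no field to the struct.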

The Backend trait

| Marker | Feature | Backs onto | Hardware |
| --- | --- | --- | --- |
| `Cpu` | none (always available, default) | ndrustfft + CPU SIMD, rayon batches | any CPU |
| `CudaNative` | `cuda-native` | cudarc + cuFFT, JIT Philox RNG | NVIDIA, CUDA 12.x |
| `CubeCl` | `gpu` / `gpu-cuda` / `gpu-wgpu` | cubecl portable kernels | NVIDIA / AMD / wgpu |
| `MetalNative` | `metal` | hand-written MSL via the metal crate | Apple GPU (f32) |
| `Accelerate` | `accelerate` | Apple vDSP FFT (CPU), rayon batches | macOS |

`Backend` and `Cpu` are re-exported at `stochastic_rs::traits`; the GPU markers live under `stochastic_rs::stochastic::device` and only exist when their feature is compiled. Selecting an unavailable backend is therefore a compile error, not a silent runtime fallback.

```rust
use stochastic_rs::traits::{Backend, Cpu};          // always available
#[cfg(feature = "metal")]
use stochastic_rs::stochastic::device::MetalNative;  // feature-gated marker
```
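The reason a missing feature becomes a compile error can be shown with a small stand-alone sketch. It assumes a crate of your own that declares a `metal` feature. `backend_name` and `preferred_backend` are hypothetical helpers, and the local `Backend`, `Cpu`, and `MetalNative` are stand-ins rather than the crate's definitions.

```rust
/// Hypothetical stand-in for the crate's `Backend` trait.
pub trait Backend {
    const NAME: &'static str;
}

pub struct Cpu;

impl Backend for Cpu {
    const NAME: &'static str = "cpu";
}

pub mod device {
    // The marker and its impl are only compiled when the `metal` feature is on.
    // Without it, `device::MetalNative` is an unresolved path.
    #[cfg(feature = "metal")]
    pub struct MetalNative;

    #[cfg(feature = "metal")]
    impl super::Backend for MetalNative {
        const NAME: &'static str = "metal";
    }
}

/// Resolve a backend's name purely from its type.
pub fn backend_name<B: Backend>() -> &'static str {
    B::NAME
}

/// Callers that must build with and without the feature gate their own code too.
#[cfg(feature = "metal")]
pub fn preferred_backend() -> &'static str {
    backend_name::<device::MetalNative>()
}

#[cfg(not(feature = "metal"))]
pub fn preferred_backend() -> &'static str {
    backend_name::<Cpu>()
}
```

Built without the feature, `preferred_backend()` resolves to the CPU marker, and any ungated mention of `device::MetalNative` is rejected by the compiler instead of falling back at runtime.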

Which processes support it

The backend type parameter is carried by the fractional / fGN family: `Fgn`, `Fbm`, `Fou`, `Fcir`, `Fgbm`, `FJacobi`, `Cfou`, `Cfgns`, `JumpFou`, `JumpFOUCustom`, and the generic `Sde`. These are the processes whose hot path is a circulant-embedding FFT, which is exactly what the GPU / accelerator backends speed up. Non-fractional processes sample on the CPU only.

Precision

Apple GPUs have no `f64` support, so `MetalNative` is `f32` only. `CudaNative`, `CubeCl`, and `Accelerate` support both `f32` and `f64`; the `Cpu` backend follows the process's `T` parameter.
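One way such a precision restriction can be expressed in the type system is a per-backend capability trait. The sketch below is an assumption about how this could be encoded, not a description of how stochastic-rs does it: `SupportsFloat`, `MetalLike`, and `sample_zeros` are hypothetical names.

```rust
/// Hypothetical capability trait: `B: SupportsFloat<T>` means backend `B`
/// can sample in float type `T`.
pub trait SupportsFloat<T> {}

pub struct Cpu;
pub struct MetalLike; // illustrative f32-only marker

impl SupportsFloat<f32> for Cpu {}
impl SupportsFloat<f64> for Cpu {}
impl SupportsFloat<f32> for MetalLike {}
// There is deliberately no `impl SupportsFloat<f64> for MetalLike`.

/// Placeholder sampler that returns `n` default values. It compiles only for
/// backend and precision pairs that have a matching impl above.
pub fn sample_zeros<B, T>(n: usize) -> Vec<T>
where
    B: SupportsFloat<T>,
    T: Default + Clone,
{
    vec![T::default(); n]
}
```

Here `sample_zeros::<Cpu, f64>(3)` and `sample_zeros::<MetalLike, f32>(3)` compile, while `sample_zeros::<MetalLike, f64>(3)` fails with an unsatisfied trait bound, so the unsupported combination is caught before the program runs.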

When is GPU worth it?

The discrete-GPU win is real but narrow: it shows up for large n (≥ 16 k) and batched sampling, while for small n the CPU SIMD path wins because GPU launch latency dominates. See the Benchmarks page for the cross-over numbers, and the Feature flags page for how to enable each backend.
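Because each backend is a separate compile-time type, a program that wants to switch by problem size has to branch between two monomorphised call sites itself. A rough sketch of that decision follows. `GPU_MIN_N`, `Choice`, and `choose` are hypothetical, and the threshold is only a placeholder based on the rough "≥ 16 k" figure, not a measured cross-over.

```rust
/// Placeholder cross-over, read as 16 * 1024 from the rough "n >= 16 k" figure.
/// Replace it with a value measured on your own hardware.
pub const GPU_MIN_N: usize = 16 * 1024;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Choice {
    Cpu,
    Gpu,
}

/// Pick a sampling path from the series length and GPU availability.
pub fn choose(n: usize, gpu_available: bool) -> Choice {
    if gpu_available && n >= GPU_MIN_N {
        Choice::Gpu
    } else {
        Choice::Cpu
    }
}
```

A caller would then `match` on the result, calling `.sample()` directly in the `Cpu` arm and `.on::<B>().sample()` with a GPU marker in the `Gpu` arm, so both paths stay statically dispatched.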
