ProcessExt
The ProcessExt contract — sample, sample_map, sample_par — covering output shape, time grid, buffer reuse, and parallel determinism.
ProcessExt<T>
Every stochastic process in stochastic-rs-stochastic — diffusion, jump,
volatility, interest, rough, noise — implements ProcessExt<T>. The public
surface is three methods:
pub trait ProcessExt<T: FloatExt>: Send + Sync {
type Output: Send;
/// One sampled path.
fn sample(&self) -> Self::Output;
/// Map `f` over `m` independently sampled paths, in parallel.
fn sample_map<R: Send>(&self, m: usize, f: impl Fn(&Self::Output) -> R + Sync) -> Vec<R>;
/// `m` independently sampled paths, kept.
fn sample_par(&self, m: usize) -> Vec<Self::Output>;
}

Under these sits a reusable per-thread sampler that owns the mutable sampling state (RNG, distribution buffers, precomputed scales). It is an implementation detail: you never name or construct it, and the three methods above are the whole interface.
Output shape
Self::Output depends on the process family:
| Family | Output |
|---|---|
| Single-factor (OU, GBM, CIR, Vasicek, …) | Array1<T> |
| Multi-state (Heston, SABR, Bergomi, …) | [Array1<T>; N] |
| Term structure / sheet (HJM tenors, Bgm, Fbs) | Array2<T> |
| Variable-dimension (multivariate Hawkes) | Vec<Array1<T>> |
| Complex-state (cfOU) | Array1<Complex<T>> |
The dimensional marker traits (OneDimensional, TwoDimensional,
MultiDimensional<N>, CurveOutput, VariableDimensional,
ComplexPathOutput) are auto-implemented from Output, so generic code can
bound on the shape it expects.
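As a rough illustration of how marker traits can be derived from an associated Output type, here is a minimal sketch. The `Process` trait, the marker definitions, and the `Ramp` / `Pair` types are hypothetical stand-ins built on `Vec<f64>`; they are not the crate's own definitions, which use ndarray types.

```rust
/// Hypothetical stand-in for a process trait with an associated output.
pub trait Process {
    type Output;
    fn sample(&self) -> Self::Output;
}

/// Marker: the process yields a single real-valued path.
pub trait OneDimensional: Process<Output = Vec<f64>> {}
impl<P> OneDimensional for P where P: Process<Output = Vec<f64>> {}

/// Marker: the process yields `N` paths at once.
pub trait MultiDimensional<const N: usize>: Process<Output = [Vec<f64>; N]> {}
impl<P, const N: usize> MultiDimensional<N> for P where P: Process<Output = [Vec<f64>; N]> {}

/// Generic code bounds on the shape it expects, not on a concrete process.
pub fn terminal_value<P: OneDimensional>(p: &P) -> f64 {
    *p.sample().last().expect("path must be non-empty")
}

/// Works for any two-state process, whatever its dynamics.
pub fn second_component<P: MultiDimensional<2>>(p: &P) -> Vec<f64> {
    let [_, v] = p.sample();
    v
}

/// Toy single-factor process: the path 0, 1, 2, ...
pub struct Ramp {
    pub n: usize,
}
impl Process for Ramp {
    type Output = Vec<f64>;
    fn sample(&self) -> Vec<f64> {
        (0..self.n).map(|i| i as f64).collect()
    }
}

/// Toy two-state process: a constant 0 path and a constant 1 path.
pub struct Pair {
    pub n: usize,
}
impl Process for Pair {
    type Output = [Vec<f64>; 2];
    fn sample(&self) -> [Vec<f64>; 2] {
        [vec![0.0; self.n], vec![1.0; self.n]]
    }
}
```

Neither `Ramp` nor `Pair` implements a marker by hand; the blanket impls pick them up from `Output`, so `terminal_value(&Ramp { n: 5 })` compiles while `terminal_value(&Pair { n: 5 })` does not.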
Path length and time grid
Construction of any process takes (n, x0, t):
- `n: usize`: number of steps (a single path has length `n`)
- `x0: Option<T>`: initial value; `None` ⇒ a sensible default per process (e.g. 0 for OU, 1 for GBM)
- `t: Option<T>`: horizon; `None` ⇒ `1.0`
The time grid is uniform over the horizon t. Non-uniform grids are handled by user code (sample on a fine uniform grid, then resample).
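The "sample fine, then resample" approach can be sketched as follows. Two assumptions are made here and may not match the crate: the grid is taken to have `n` points spanning `[0, t]` inclusive (step `t / (n - 1)`), and resampling uses plain linear interpolation. Both helper functions are hypothetical and not part of the library.

```rust
/// Uniform grid of `n` points on [0, t].
/// Assumption: first point is 0 and last point is `t`, so the step is t / (n - 1).
pub fn uniform_grid(n: usize, t: f64) -> Vec<f64> {
    assert!(n >= 2, "need at least two grid points");
    let dt = t / (n as f64 - 1.0);
    (0..n).map(|i| i as f64 * dt).collect()
}

/// Linearly interpolate a path that lives on `uniform_grid(path.len(), t)`
/// at arbitrary query times in [0, t].
pub fn resample(path: &[f64], t: f64, times: &[f64]) -> Vec<f64> {
    let n = path.len();
    assert!(n >= 2, "need at least two path points");
    let dt = t / (n as f64 - 1.0);
    times
        .iter()
        .map(|&s| {
            assert!((0.0..=t).contains(&s), "query time outside [0, t]");
            let pos = s / dt;
            // Clamp so that s == t falls into the last interval.
            let i = (pos.floor() as usize).min(n - 2);
            let w = pos - i as f64;
            path[i] * (1.0 - w) + path[i + 1] * w
        })
        .collect()
}
```

Linear interpolation is the simplest choice and smooths the path between grid points; when the non-uniform times matter statistically, pick the fine grid so that every target time lands on (or very near) a grid point.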
Choosing a method
use stochastic_rs::stochastic::diffusion::gbm::Gbm;
use stochastic_rs::simd_rng::Unseeded;
let gbm = Gbm::<f64, _>::new(0.05, 0.2, 64, Some(1.0), Some(1.0), Unseeded);
// one path
let path = gbm.sample();
// 1M paths, reduced to a payoff mean — no per-path allocation
let mean = gbm.sample_map(1_000_000, |p| (p.last().unwrap() - 1.0).max(0.0)).iter().sum::<f64>()
/ 1_000_000.0;
// 1k paths, kept (for plotting / path-dependent post-processing)
let paths = gbm.sample_par(1_000);

| Method | Returns | Use when |
|---|---|---|
| `sample()` | one `Output` | a single path, or you drive your own loop. |
| `sample_map(m, f)` | `Vec<R>` | a parallel Monte-Carlo reduction where you only need a function of each path (payoff, terminal value, a Greek). Reuses one sampler and one output buffer per worker; nothing is materialised. This is the fast path. |
| `sample_par(m)` | `Vec<Output>` | you need to keep all `m` paths (storage, plotting, path-dependent processing). Allocates each path fresh. |
Both parallel methods build one sampler per Rayon worker, not one per
path, which removes the per-path RNG-setup and (for sample_map) the
per-path allocation. Reach for sample_map whenever the answer is a number,
not the paths themselves.
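The one-sampler-per-worker idea can be sketched without the crate. The version below uses `std::thread::scope` in place of Rayon, a SplitMix64 generator in place of the library's RNG, and a toy random walk in place of a real process; `sample_map_sketch`, `fill_path`, and the fixed per-worker seeding are all assumptions made for illustration, not the library's implementation.

```rust
use std::thread;

/// Tiny SplitMix64 generator, used only to keep the sketch dependency-free.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Overwrite `buf` with a toy random walk that starts at 0.
fn fill_path(rng: &mut SplitMix64, buf: &mut [f64]) {
    let mut x = 0.0;
    for slot in buf.iter_mut() {
        *slot = x;
        x += rng.next_f64() - 0.5;
    }
}

/// Map `f` over `m` toy paths of length `n` using `workers` threads.
/// Each worker owns one RNG and one buffer and reuses both for all its paths.
/// Results come back grouped by worker, not in a global path order.
pub fn sample_map_sketch<R: Send>(
    m: usize,
    n: usize,
    workers: usize,
    f: impl Fn(&[f64]) -> R + Sync,
) -> Vec<R> {
    let workers = workers.max(1);
    let f = &f;
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                s.spawn(move || {
                    // Worker w handles paths w, w + workers, w + 2 * workers, ...
                    let count = (m + workers - 1 - w) / workers;
                    let mut rng = SplitMix64(w as u64 + 1); // set up once per worker
                    let mut buf = vec![0.0_f64; n]; // allocated once per worker
                    (0..count)
                        .map(|_| {
                            fill_path(&mut rng, &mut buf);
                            f(buf.as_slice())
                        })
                        .collect::<Vec<R>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

Because `f` only borrows the buffer, the path is gone once `f` returns; that is what makes the reduction allocation-free per path, and also why a method that keeps the paths has to allocate each one.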
Performance
The reuse matters most in the many-short-paths regime (large m, small
n), where the fixed per-path costs are a large fraction of the work. On
Apple Silicon, GBM, a parallel reduction over
path-elements (cargo bench --bench sampler_compare):
| n | old idiom (parallel sample() fold) | sample_map | speedup |
|---|---|---|---|
| 64 | 364 µs | 189 µs | 1.93× |
| 256 | 188 µs | 173 µs | 1.09× |
| 1024 | 165 µs | 159 µs | 1.04× |
The dominant lever is the auto-seed counter becoming contention-free (see the seeding concept page); buffer reuse and uninitialised allocation add the rest. For longer paths the path computation itself dominates, so the speedup tapers to a few percent (1.09× at n = 256 and 1.04× at n = 1024). Serial and single-path sampling are within a few percent of the pre-2.x figures: the win is specifically parallel and short-path.
Parallel sampling and determinism
Each Rayon worker derives its own RNG from the process's seed source. The consequences depend on the seed strategy:
- Deterministic + sample() (serial): bit-for-bit reproducible. Repeated calls advance the seed, so successive paths differ. A serial loop of sample() is the way to get reproducible Monte-Carlo runs.
- Deterministic + sample_par / sample_map (parallel): every path is an independent, valid draw, but the order in which workers consume derived seeds depends on Rayon scheduling, so the result is not bit-reproducible across runs. Use it for production throughput where statistical reproducibility suffices; use a serial loop when you need bit-exact replay.
- Unseeded: every path is independently auto-seeded in either mode.
In all modes the seeds handed out are globally unique, so paths never
collide or "get stuck" on a repeated stream — including across the
thread-local seed-block boundary (validated in
tests/sampler_v3_rng.rs).
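The difference between serial replay and parallel scheduling can be illustrated with a counter-based seed source. This is a sketch under assumptions: `CountingSeed`, its mixing function, and the use of `std::thread::scope` instead of Rayon are hypothetical, and the crate's actual seed derivation (including its thread-local seed blocks) is not reproduced here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical deterministic seed source: a root seed plus an atomic counter.
pub struct CountingSeed {
    root: u64,
    counter: AtomicU64,
}

impl CountingSeed {
    pub fn new(root: u64) -> Self {
        Self { root, counter: AtomicU64::new(0) }
    }

    /// Hand out the next derived seed. Every step below is a bijection on
    /// u64, so distinct counter values always give distinct seeds.
    pub fn next_seed(&self) -> u64 {
        let k = self.counter.fetch_add(1, Ordering::Relaxed);
        let mut z = self.root.wrapping_add(k.wrapping_mul(0x9E37_79B9_7F4A_7C15));
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Serial draw: the i-th call always sees counter value i, so a run replays.
pub fn serial_seeds(root: u64, m: usize) -> Vec<u64> {
    let src = CountingSeed::new(root);
    (0..m).map(|_| src.next_seed()).collect()
}

/// Parallel draw: which thread sees which counter value depends on
/// scheduling, so the per-thread assignment can change between runs,
/// but no seed is ever handed out twice.
pub fn parallel_seeds(root: u64, workers: usize, per_worker: usize) -> Vec<Vec<u64>> {
    let src = CountingSeed::new(root);
    let src = &src;
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                s.spawn(move || {
                    (0..per_worker).map(|_| src.next_seed()).collect::<Vec<u64>>()
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

In this sketch the parallel run consumes exactly the same set of seeds as a serial run of the same size; only the assignment of seeds to workers varies, which is why the statistics agree while the bit patterns of individual results need not.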
FGN ships a sample_pair fast path that produces two independent paths in
one FFT — a 2× shortcut over calling sample() twice while maintaining
exact independence.
Acceleration
Default implementation: CPU SIMD via f64x4 / f32x8 where the sampler
allows. GPU backends (cuda, metal) are opt-in features and currently
ship for FGN / fBM only — see the add-gpu-sampler SKILL.
Construction patterns
Every process follows the same canonical new(args, seed) constructor
(see the seeding concept page for the full
design):
use stochastic_rs::simd_rng::{Deterministic, Unseeded};
let p = Foo::<T, _>::new(/* params */, n, x0, t, Unseeded); // auto-seeded
let p = Foo::<T, _>::new(/* params */, n, x0, t, Deterministic::new(42)); // reproducible
let p = Foo::<T, _>::new(/* params */, n, x0, t, shared_seed_source); // chain seeds

Use Deterministic in tests so they replay bit-exactly. Pass a shared
seed source when chaining correlated processes: each seed.rng() /
seed.rng_ext() call atomically advances the internal counter, so the
streams diverge despite the shared root. SeedExt::reseed(u64) swaps a
Deterministic source in place to sweep seeds without rebuilding the
process.
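A small model of seed chaining and in-place reseeding, for intuition only. `SharedSeed`, `derive`, and `Toy` are hypothetical names, not the crate's `Deterministic` or `SeedExt` API, and the choice to restart the counter on `reseed` is an assumption of this sketch.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared seed source: a swappable root plus a call counter.
pub struct SharedSeed {
    root: AtomicU64,
    calls: AtomicU64,
}

impl SharedSeed {
    pub fn new(root: u64) -> Arc<Self> {
        Arc::new(Self {
            root: AtomicU64::new(root),
            calls: AtomicU64::new(0),
        })
    }

    /// Each call advances the counter, so two holders of the same source
    /// never receive the same value.
    pub fn derive(&self) -> u64 {
        let k = self.calls.fetch_add(1, Ordering::Relaxed);
        self.root.load(Ordering::Relaxed)
            ^ k.wrapping_add(1).wrapping_mul(0x9E37_79B9_7F4A_7C15)
    }

    /// Swap the root in place and restart the counter (assumed behaviour).
    pub fn reseed(&self, root: u64) {
        self.root.store(root, Ordering::Relaxed);
        self.calls.store(0, Ordering::Relaxed);
    }
}

/// Toy process: "sampling" just reports the stream seed it would use.
pub struct Toy {
    pub seed: Arc<SharedSeed>,
}

impl Toy {
    pub fn sample(&self) -> u64 {
        self.seed.derive()
    }
}
```

Two `Toy` values built from clones of the same `Arc<SharedSeed>` get different stream seeds on every call, and calling `reseed` on the shared handle changes what both of them draw next without rebuilding either one.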