The ProcessExt contract — sample, sample_map, sample_par — covering output shape, time grid, buffer reuse, and parallel determinism.

`ProcessExt<T>`

Every stochastic process in stochastic-rs-stochastic — diffusion, jump, volatility, interest, rough, noise — implements ProcessExt<T>. The public surface is three methods:

pub trait ProcessExt<T: FloatExt>: Send + Sync {
    type Output: Send;

    /// One sampled path.
    fn sample(&self) -> Self::Output;

    /// Map `f` over `m` independently sampled paths, in parallel.
    fn sample_map<R: Send>(&self, m: usize, f: impl Fn(&Self::Output) -> R + Sync) -> Vec<R>;

    /// `m` independently sampled paths, kept.
    fn sample_par(&self, m: usize) -> Vec<Self::Output>;
}

Under these sits a reusable per-thread sampler that owns the mutable sampling state (RNG, distribution buffers, precomputed scales). It is an implementation detail — you never name or construct it; the three methods above are the whole interface.

Output shape

Self::Output depends on the process family:

Family	`Output`
Single-factor (OU, GBM, CIR, Vasicek, …)	`Array1<T>`
Multi-state (Heston, SABR, Bergomi, …)	`[Array1<T>; N]`
Term structure / sheet (HJM tenors, Bgm, Fbs)	`Array2<T>`
Variable-dimension (multivariate Hawkes)	`Vec<Array1<T>>`
Complex-state (cfOU)	`Array1<Complex<T>>`

The dimensional marker traits (OneDimensional, TwoDimensional, MultiDimensional<N>, CurveOutput, VariableDimensional, ComplexPathOutput) are auto-implemented from Output, so generic code can bound on the shape it expects.
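A minimal, std-only sketch of how a shape marker can be derived from the associated output type and then used as a bound. Every name here (`Process`, `OneDimLike`, `Ramp`, `TwoState`, `terminal_value`) is a hypothetical stand-in, and `Vec<f64>` replaces `Array1<T>`; this is not the crate's actual trait hierarchy.

```rust
/// Hypothetical stand-in for a process trait with an associated output type.
trait Process {
    type Output;
    fn sample(&self) -> Self::Output;
}

/// Marker that is implemented automatically for every process whose
/// output is a single 1-D path (Vec<f64> stands in for Array1<T>).
trait OneDimLike: Process<Output = Vec<f64>> {}
impl<P: Process<Output = Vec<f64>>> OneDimLike for P {}

/// Toy single-factor "process": a deterministic ramp starting at x0.
struct Ramp {
    n: usize,
    x0: f64,
    step: f64,
}

impl Process for Ramp {
    type Output = Vec<f64>;
    fn sample(&self) -> Vec<f64> {
        (0..self.n).map(|i| self.x0 + self.step * i as f64).collect()
    }
}

/// Toy two-state "process" whose output is a fixed-size array of paths.
struct TwoState {
    n: usize,
}

impl Process for TwoState {
    type Output = [Vec<f64>; 2];
    fn sample(&self) -> [Vec<f64>; 2] {
        [vec![1.0; self.n], vec![0.04; self.n]]
    }
}

/// Generic code that only accepts 1-D outputs.
fn terminal_value<P: OneDimLike>(p: &P) -> f64 {
    *p.sample().last().expect("path must be non-empty")
}
```

In this sketch `terminal_value(&Ramp { .. })` compiles, while `terminal_value(&TwoState { .. })` is rejected at compile time because its `Output` is not a single path.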

Path length and time grid

Every process constructor takes (n, x0, t) after its model parameters:

n: usize — number of steps (a single path is length n)
x0: Option<T> — initial value; None ⇒ a sensible default per process (e.g. 0 for OU, 1 for GBM)
t: Option<T> — horizon; None ⇒ 1.0

The time grid is uniform: $\Delta t = t / n$. Non-uniform grids are handled by user code (sample at fine $\Delta t$, then resample).
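The "sample fine, then resample" approach can be sketched with plain slices. The helpers `uniform_dt` and `resample` are hypothetical, not part of the crate, and the sketch assumes that `path[i]` holds the value at time `i * dt`, which the text above does not confirm.

```rust
/// Uniform grid step, following the text: dt = t / n.
fn uniform_dt(t: f64, n: usize) -> f64 {
    t / n as f64
}

/// Linearly interpolate a uniformly sampled path at arbitrary query times.
/// Assumption: path[i] is the value at time i * dt. Query times outside
/// the sampled range are clamped to the first or last point.
fn resample(path: &[f64], dt: f64, times: &[f64]) -> Vec<f64> {
    assert!(!path.is_empty(), "path must be non-empty");
    let last = path.len() - 1;
    times
        .iter()
        .map(|&s| {
            if last == 0 {
                return path[0];
            }
            // Fractional index of time s on the fine grid, clamped to the path.
            let pos = (s / dt).clamp(0.0, last as f64);
            let i = (pos.floor() as usize).min(last - 1);
            let w = pos - i as f64;
            path[i] * (1.0 - w) + path[i + 1] * w
        })
        .collect()
}
```

For example, with `dt = uniform_dt(1.0, 4)` and the path `[0.0, 1.0, 2.0, 4.0]`, querying at time `0.625` lands halfway between the third and fourth points. Linear interpolation is only one choice; for a diffusion, a bridge-based refinement may be more appropriate than interpolating between grid points.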

Choosing a method

use stochastic_rs::stochastic::diffusion::gbm::Gbm;
use stochastic_rs::simd_rng::Unseeded;

let gbm = Gbm::<f64, _>::new(0.05, 0.2, 64, Some(1.0), Some(1.0), Unseeded);

// one path
let path = gbm.sample();

// 1M paths, reduced to a payoff mean — no per-path allocation
let mean = gbm.sample_map(1_000_000, |p| (p.last().unwrap() - 1.0).max(0.0)).iter().sum::<f64>()
    / 1_000_000.0;

// 1k paths, kept (for plotting / path-dependent post-processing)
let paths = gbm.sample_par(1_000);

Method	Returns	Use when
`sample()`	one `Output`	a single path, or you drive your own loop.
`sample_map(m,f)`	`Vec<R>`	a parallel Monte-Carlo reduction — you only need a function of each path (payoff, terminal value, a Greek). Reuses one sampler and one output buffer per worker; nothing is materialised. This is the fast path.
`sample_par(m)`	`Vec<Output>`	you need to keep all `m` paths (storage, plotting, path-dependent processing). Allocates each path fresh.

Both parallel methods build one sampler per Rayon worker, not one per path, which removes the per-path RNG-setup and (for sample_map) the per-path allocation. Reach for sample_map whenever the answer is a number, not the paths themselves.
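The per-worker reuse pattern can be illustrated without the crate. The sketch below uses `std::thread::scope` in place of Rayon, a toy random-walk sampler in place of a real process, and the hypothetical names `Sampler` and `sample_map_sketch`; it shows the shape of the idea (one sampler and one buffer per worker, a closure applied to a borrowed path), not the library's implementation.

```rust
use std::thread;

/// Toy per-worker sampler: owns its RNG state and is reused across paths.
struct Sampler {
    state: u64,
}

impl Sampler {
    fn new(seed: u64) -> Self {
        Sampler { state: seed }
    }

    /// SplitMix64 step, used here only as a small stand-in generator.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }

    /// Overwrite `buf` in place with a +/-1 random walk starting at 0.
    fn fill(&mut self, buf: &mut [f64]) {
        let mut x = 0.0;
        for slot in buf.iter_mut() {
            *slot = x;
            x += if self.next_u64() & 1 == 0 { 1.0 } else { -1.0 };
        }
    }
}

/// Apply `f` to `m` paths of length `n`, split across `workers` threads.
fn sample_map_sketch<R, F>(m: usize, n: usize, workers: usize, f: F) -> Vec<R>
where
    R: Send,
    F: Fn(&[f64]) -> R + Sync,
{
    let workers = workers.max(1);
    let f = &f;
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                // Worker w takes paths w, w + workers, w + 2 * workers, ...
                let count = (m + workers - 1 - w) / workers;
                s.spawn(move || {
                    // Built once per worker, not once per path.
                    let mut sampler = Sampler::new(w as u64 + 1);
                    let mut buf = vec![0.0; n];
                    (0..count)
                        .map(|_| {
                            sampler.fill(&mut buf);
                            f(&buf)
                        })
                        .collect::<Vec<R>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

Because the closure only borrows the buffer, it has to return an owned summary (a payoff, a terminal value), which is exactly the restriction that makes the buffer reusable. Keeping whole paths would force a fresh allocation per path, which is the `sample_par` trade-off described above.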

Performance

The reuse matters most in the many-short-paths regime (large m, small n), where the fixed per-path costs are a large fraction of the work. The figures below come from a GBM benchmark on Apple Silicon: a parallel reduction over $\approx 2.6 \times 10^5$ path-elements (cargo bench --bench sampler_compare):

`n`	old idiom (parallel `sample()` fold)	`sample_map`	speedup
64	364 µs	189 µs	1.93×
256	188 µs	173 µs	1.09×
1024	165 µs	159 µs	1.04×

The dominant lever is the auto-seed counter becoming contention-free (see seeding); buffer reuse and uninitialised allocation add the rest. For longer paths ($n \geq 256$) the path computation itself dominates, so the speedup tapers to under 10%. Serial and single-path sampling are within a few percent of the pre-2.x figures; the win is specifically parallel and short-path.

Parallel sampling and determinism

Each Rayon worker derives its own RNG from the process's seed source. The consequences depend on the seed strategy:

Deterministic + sample() (serial) — bit-for-bit reproducible. Repeated calls advance the seed, so successive paths differ. A serial loop of sample() is the way to get reproducible Monte-Carlo runs.
Deterministic + sample_par / sample_map (parallel) — every path is an independent, valid draw, but the order in which workers consume derived seeds depends on Rayon scheduling, so the result is not bit-reproducible across runs. Use it for production throughput where statistical reproducibility suffices; use a serial loop when you need bit-exact replay.
Unseeded — every path is independently auto-seeded in either mode.

In all modes the seeds handed out are globally unique, so paths never collide or "get stuck" on a repeated stream — including across the thread-local seed-block boundary (validated in tests/sampler_v3_rng.rs).
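Why parallel draws stay unique yet lose bit-reproducibility can be seen in a small model. The sketch below assumes a seed source built from a root seed and a single atomic counter; `SeedSource`, `mix`, `draw_serial`, `draw_parallel`, and `all_unique` are hypothetical names, and the real crate's thread-local seed blocks are deliberately not modelled.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical seed source: a root seed plus an atomic call counter.
struct SeedSource {
    root: u64,
    counter: AtomicU64,
}

/// SplitMix64 finaliser. Each step is invertible, so it is a bijection
/// on u64 and distinct inputs always give distinct outputs.
fn mix(mut z: u64) -> u64 {
    z = z.wrapping_add(0x9E3779B97F4A7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
    z ^ (z >> 31)
}

impl SeedSource {
    fn new(root: u64) -> Self {
        SeedSource { root, counter: AtomicU64::new(0) }
    }

    /// fetch_add hands every caller a different counter value,
    /// so no two callers ever receive the same derived seed.
    fn next_seed(&self) -> u64 {
        let k = self.counter.fetch_add(1, Ordering::Relaxed);
        mix(self.root ^ mix(k))
    }
}

/// Serial draws: the k-th call always sees counter value k.
fn draw_serial(src: &SeedSource, count: usize) -> Vec<u64> {
    (0..count).map(|_| src.next_seed()).collect()
}

/// Parallel draws: which worker sees which counter value depends on scheduling.
fn draw_parallel(src: &SeedSource, workers: usize, per_worker: usize) -> Vec<Vec<u64>> {
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                s.spawn(move || {
                    (0..per_worker).map(|_| src.next_seed()).collect::<Vec<u64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}

/// True when no seed appears twice.
fn all_unique(seeds: &[u64]) -> bool {
    let set: HashSet<u64> = seeds.iter().copied().collect();
    set.len() == seeds.len()
}
```

In this model two serial runs from the same root return identical sequences, whereas a parallel run returns the same set of seeds with a scheduling-dependent assignment to workers. That is the distinction between bit-exact replay and statistical reproducibility drawn above.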

FGN ships a sample_pair fast path that produces two independent paths from a single FFT: a 2× shortcut over calling sample() twice, with exact independence preserved.

Acceleration

The default implementation uses CPU SIMD (f64x4 / f32x8) where the sampler allows. GPU backends (cuda, metal) are opt-in features and currently ship for FGN / fBM only; see the add-gpu-sampler SKILL.

Construction patterns

Every process follows the same canonical new(args, seed) constructor (see the seeding concept page for the full design):

use stochastic_rs::simd_rng::{Deterministic, Unseeded};

let p = Foo::<T, _>::new(/* params */, n, x0, t, Unseeded);                  // auto-seeded
let p = Foo::<T, _>::new(/* params */, n, x0, t, Deterministic::new(42));    // reproducible
let p = Foo::<T, _>::new(/* params */, n, x0, t, shared_seed_source);        // chain seeds

Use Deterministic in tests so they replay bit-exactly. Pass a shared seed source when chaining correlated processes — each seed.rng() / seed.rng_ext() call atomically advances the internal counter, so the streams diverge despite the shared root. SeedExt::reseed(u64) swaps a Deterministic source in place to sweep seeds without rebuilding the process.
