
Python function

build_stochastic_acceptance_sampler_graph

build_stochastic_acceptance_sampler_graph()

max.pipelines.sampling.build_stochastic_acceptance_sampler_graph(device)


Builds a target-only stochastic rejection sampler for speculative decoding.

Accepts a draft token when a random draw, coin, satisfies coin < p_target(draft_token), where p_target is the target model's probability for that token after temperature, top-k, and top-p filtering have been applied. No draft probabilities are needed.
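The acceptance rule above can be illustrated outside the graph with a minimal NumPy sketch. This is not the MAX implementation: the helper names (filtered_target_probs, accept_draft_token), the filtering order (temperature, then top-k, then top-p on the renormalized distribution), and the assumption that coin is drawn uniformly from [0, 1) are all illustrative choices.

```python
import numpy as np


def filtered_target_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Hypothetical helper: softmax after temperature, top-k, then top-p."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()

    # Token ids sorted by descending probability.
    order = np.argsort(-probs, kind="stable")

    # Top-k: zero out everything beyond the k most likely tokens (0 disables).
    if top_k > 0:
        probs[order[top_k:]] = 0.0
        probs /= probs.sum()

    # Top-p: drop a token once the mass of the tokens ranked before it
    # already reaches top_p. The most likely token is always kept.
    sorted_probs = probs[order]
    mass_before = np.cumsum(sorted_probs) - sorted_probs
    probs[order[mass_before >= top_p]] = 0.0
    return probs / probs.sum()


def accept_draft_token(draft_token, logits, coin, **sampling):
    """Target-only test: accept when coin < p_target(draft_token)."""
    p_target = filtered_target_probs(logits, **sampling)
    return bool(coin < p_target[draft_token])
```

A draft token that the filters remove has p_target equal to 0, so under this sketch it is rejected for any coin value.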

The sampling RNG seed is bound as a graph input. Callers refresh it on each execution so that the random draws vary across calls.
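To see why the seed needs refreshing, consider a small NumPy analogy. It does not use the graph's RNG; draw_coins is a hypothetical helper that maps a seed to one coin per draft position.

```python
import numpy as np


def draw_coins(seed, num_draft_tokens):
    """Hypothetical helper: one uniform draw in [0, 1) per draft position."""
    return np.random.default_rng(seed).random(num_draft_tokens)


# Reusing a seed reproduces the same coins, so acceptance decisions repeat.
same = np.array_equal(draw_coins(7, 4), draw_coins(7, 4))

# Passing a fresh seed on each call yields different coins.
different = not np.array_equal(draw_coins(7, 4), draw_coins(8, 4))
```

In this analogy, a caller that never changed the seed would get identical coins, and therefore identical accept/reject decisions, for identical inputs.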

Parameters:

device (DeviceRef) – Device for the graph.

Returns:

A graph that takes draft tokens, target logits, target logit offsets, sampling parameters, and a per-execute seed, and outputs the first rejected index, recovered tokens, and a bonus token.
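As a rough illustration of the first output, the first rejected index for one sequence could be derived from per-position coins and target probabilities as sketched below. The helper name is hypothetical, and returning the number of draft tokens when every token is accepted is an assumption, not documented behavior. Recovered tokens and the bonus token are left out because their exact semantics are not described here.

```python
import numpy as np


def first_rejected_index(coins, p_target_of_draft):
    """Hypothetical helper: index of the first draft token that fails
    coin < p_target, or the draft length if none fails (assumed convention)."""
    accepted = np.asarray(coins) < np.asarray(p_target_of_draft)
    rejected = np.flatnonzero(~accepted)
    return int(rejected[0]) if rejected.size else int(accepted.size)
```

For example, coins of [0.1, 0.9, 0.2] against target probabilities of [0.5, 0.5, 0.5] give index 1: the first token is accepted and the second is rejected, so the third is not considered.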

Return type:

Graph