Python function
build_stochastic_acceptance_sampler_graph
max.pipelines.sampling.build_stochastic_acceptance_sampler_graph(device)
Builds a target-only stochastic rejection sampler for speculative decoding.
Accepts draft tokens based on coin < p_target(draft_token) where
p_target is computed after applying temperature, top-k, and top-p
filtering. No draft probabilities are needed.
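The acceptance rule described above can be sketched in plain NumPy. This is an illustration only, not the graph that this function builds: the helper names `filtered_target_probs` and `accept_draft_token` are hypothetical, the coin is assumed to be a uniform draw in [0, 1), and the exact way the real sampler combines top-k and top-p (for example, whether it renormalizes between the two) is an assumption here.

```python
import numpy as np


def filtered_target_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Hypothetical helper: softmax after temperature, then top-k and top-p masks."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()

    order = np.argsort(-probs, kind="stable")  # token ids, most probable first
    sorted_probs = probs[order]

    keep = np.ones(len(probs), dtype=bool)  # mask over sorted positions
    if top_k > 0:
        keep[top_k:] = False  # top-k: keep only the k most probable tokens
    # top-p: keep the smallest prefix whose cumulative mass reaches top_p
    mass_before = np.cumsum(sorted_probs) - sorted_probs
    keep &= mass_before < top_p

    filtered = np.zeros_like(probs)
    filtered[order[keep]] = sorted_probs[keep]
    return filtered / filtered.sum()  # renormalize over surviving tokens


def accept_draft_token(logits, draft_token, coin, **filter_kwargs):
    """Accept when coin < p_target(draft_token); no draft probabilities involved."""
    p_target = filtered_target_probs(logits, **filter_kwargs)
    return bool(coin < p_target[draft_token])
```

In this sketch, a draft token that the filters remove has a target probability of zero, so it is never accepted regardless of the coin.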
The sampling RNG seed is bound as a graph input: callers refresh it per execution so RNG varies across calls.
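Why the seed needs refreshing can be shown with a small NumPy analogy. It does not use the MAX graph API; `draw_coins` is a hypothetical stand-in for one execution of a sampler whose only source of randomness is the seed passed in, and incrementing the seed per step is just one possible refresh scheme.

```python
import numpy as np


def draw_coins(seed, num_draft_tokens):
    """Hypothetical stand-in for one execution: coins depend only on the seed."""
    rng = np.random.default_rng(seed)
    return rng.random(num_draft_tokens)  # uniform draws in [0, 1)


base_seed = 1234

# Reusing the same seed reproduces the same coins on every execution.
step0 = draw_coins(base_seed, 4)
step0_again = draw_coins(base_seed, 4)

# Passing a fresh seed (here, incremented per step) changes the coins.
step1 = draw_coins(base_seed + 1, 4)
```

If the caller never updated the seed input, every execution would make its accept or reject decisions from the same sequence of coins.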