IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /max/get-started.md). For the complete documentation index, see llms.txt.

Python module

max.pipelines.sampling

Configuration

SamplingConfig: Configuration for the sampling stage of token generation.
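To make "the sampling stage" concrete, here is a minimal pure-Python sketch of temperature and top-k sampling over one row of logits. It illustrates the general technique only and does not use the MAX API: the function `sample_token` and its parameters `temperature` and `top_k` are hypothetical names, and the actual fields of SamplingConfig may differ.

```python
import math
import random
from typing import List


def sample_token(
    logits: List[float],
    temperature: float,
    top_k: int,
    rng: random.Random,
) -> int:
    """Samples one token id from a row of logits (illustrative only)."""
    # Assumption: a non-positive temperature means greedy decoding.
    if temperature <= 0.0:
        return max(range(len(logits)), key=logits.__getitem__)

    # Keep the indices of the top_k largest logits.
    order = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
    kept = order[:top_k]

    # Softmax over the kept logits, scaled by temperature.
    # Subtracting the maximum keeps exp() numerically stable.
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalizes the weights internally.
    return rng.choices(kept, weights=weights, k=1)[0]


rng = random.Random(0)
greedy = sample_token([0.1, 2.0, 0.3], temperature=0.0, top_k=3, rng=rng)
sampled = sample_token([0.1, 2.0, 0.3], temperature=1.0, top_k=2, rng=rng)
```

With `top_k=1` or a zero temperature the sketch reduces to an argmax, which is why greedy decoding is often treated as a special case of the same configuration.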

Processors

FrequencyData: Container for token frequency data in CSR format.
FusedSamplingProcessor: Applies sampling parameters to logits and stores the chosen tokens.
PenaltyInputs: Container for penalty inputs.
SamplerInputs: Container for sampler inputs.
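As a rough illustration of why token frequency data suits a CSR (compressed sparse row) layout, the sketch below stores, for each sequence in a batch, only the tokens that have already appeared, then uses those counts to apply a frequency penalty to the logits. The layout (`offsets`, `token_ids`, `counts`), the function `apply_frequency_penalty`, and the penalty formula are assumptions made for this example; they are not taken from FrequencyData or PenaltyInputs.

```python
from typing import List


def apply_frequency_penalty(
    logits: List[List[float]],
    offsets: List[int],
    token_ids: List[int],
    counts: List[int],
    penalty: float,
) -> List[List[float]]:
    """Subtracts penalty * count from the logit of each seen token.

    Hypothetical CSR layout: batch row i owns the entries
    offsets[i]:offsets[i + 1] of token_ids and counts.
    """
    out = [row[:] for row in logits]  # copy so the input stays untouched
    for i in range(len(out)):
        for j in range(offsets[i], offsets[i + 1]):
            out[i][token_ids[j]] -= penalty * counts[j]
    return out


# Two sequences over a 4-token vocabulary.
logits = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
offsets = [0, 2, 3]      # row 0 has 2 entries, row 1 has 1 entry
token_ids = [1, 3, 0]    # tokens seen so far, row by row
counts = [2, 1, 4]       # how often each of those tokens appeared
penalized = apply_frequency_penalty(logits, offsets, token_ids, counts, 0.5)
```

Because most vocabulary entries never appear in a given sequence, the sparse layout touches only the seen tokens instead of scanning a dense batch-by-vocabulary count matrix.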

Samplers

RejectionRunner: Interface for rejection sampling runners.
SyntheticRunner: Synthetic acceptance sampler for benchmarking.
TokenSampler: Samples tokens from the logits.
rejection_runner_registry: Given a rejection runner strategy, returns the type of RejectionRunner.
rejection_sampler: Builds a graph that implements speculative decoding rejection sampling.
rejection_sampler_with_residuals: Builds a rejection sampler with residual sampling for speculative decoding.
token_sampler: Builds a sampling graph that samples tokens from logits.
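For readers unfamiliar with rejection sampling with residuals in speculative decoding, the sketch below shows the commonly described scheme for a single position: accept the draft token with probability min(1, p/q), where p and q are the target and draft probabilities of that token, and otherwise resample from the normalized residual max(p - q, 0). This is a plain-Python illustration under those assumptions, not the graph that `rejection_sampler_with_residuals` builds; the function name `rejection_sample_step` and its signature are hypothetical.

```python
import random
from typing import List, Tuple


def rejection_sample_step(
    draft_token: int,
    draft_probs: List[float],
    target_probs: List[float],
    rng: random.Random,
) -> Tuple[int, bool]:
    """Returns (token, accepted) for one speculative position."""
    q = draft_probs[draft_token]
    p = target_probs[draft_token]

    # Accept the draft token with probability min(1, p / q).
    if q > 0.0 and rng.random() < min(1.0, p / q):
        return draft_token, True

    # On rejection, sample from the residual distribution max(p - q, 0).
    residual = [max(t - d, 0.0) for t, d in zip(target_probs, draft_probs)]
    if sum(residual) <= 0.0:
        # Assumption: if the residual has no mass, fall back to the
        # target distribution itself.
        residual = target_probs
    token = rng.choices(range(len(residual)), weights=residual, k=1)[0]
    return token, False


rng = random.Random(0)
# Identical distributions: the draft token is always accepted.
same = rejection_sample_step(0, [0.5, 0.5], [0.5, 0.5], rng)
# Target puts no mass on the draft token: rejected, resampled to token 1.
differ = rejection_sample_step(0, [1.0, 0.0], [0.0, 1.0], rng)
```

Sampling from the residual on rejection is what keeps the combined output distributed according to the target model rather than the draft model.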

Logits processing

apply_logits_processors: Applies logits processors to a batch of logits.
build_greedy_acceptance_sampler_graph: Builds a graph that implements strict greedy acceptance for MTP.
build_stochastic_acceptance_sampler_graph: Builds a target-only stochastic rejection sampler for speculative decoding.
build_synthetic_acceptance_sampler_graph: Builds a graph that implements synthetic acceptance sampling.
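One plausible reading of "strict greedy acceptance" is sketched below: draft tokens are accepted only while each one equals the target model's argmax at that position, and the target's argmax at the first mismatch (or after the last accepted token) becomes the next token. This is an assumption about the general technique, written in plain Python; the helper `greedy_accept` is hypothetical and is not the graph produced by `build_greedy_acceptance_sampler_graph`.

```python
from typing import List, Tuple


def greedy_accept(
    draft_tokens: List[int],
    target_logits: List[List[float]],
) -> Tuple[int, int]:
    """Returns (number of accepted draft tokens, next token).

    Assumes target_logits has len(draft_tokens) + 1 rows: row i scores
    the position where draft token i was proposed, and the final row
    scores the position after the last draft token.
    """

    def argmax(row: List[float]) -> int:
        return max(range(len(row)), key=row.__getitem__)

    accepted = 0
    for token, row in zip(draft_tokens, target_logits):
        if argmax(row) != token:
            break  # strict: stop at the first mismatch
        accepted += 1

    # The target's own choice at the first unverified position.
    next_token = argmax(target_logits[accepted])
    return accepted, next_token


# The first two draft tokens match the target argmax; the third does not.
partial = greedy_accept(
    [2, 0, 1],
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
)
# Every draft token matches, so the final row supplies the bonus token.
full = greedy_accept([2], [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
```

Unlike the stochastic variant, this check needs no random numbers, so its output matches what greedy decoding with the target model alone would produce.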