Python function

token_sampler

`token_sampler()`

max.pipelines.sampling.token_sampler(sampling_config, device, return_logits=False, needs_bitmask_input=None, custom_extensions=())

Builds a sampling graph that samples tokens from logits.

Parameters:

sampling_config (SamplingConfig) – Sampling configuration (top-k, temperature, etc.).
device (DeviceRef) – Device for the graph inputs and ops.
return_logits (bool) – Whether the graph should expose logits as an output.
needs_bitmask_input (bool | None) – Whether to wire a bitmask input into the graph. When None, the value falls back to sampling_config.enable_structured_output. Pass True explicitly when tool-call grammars can fire even though --enable-structured-output is off.
custom_extensions (Iterable[Path]) – Custom-op extension paths to compile the graph with. Empty by default.
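
The fallback described for needs_bitmask_input can be sketched in plain Python. This is an illustration of the documented rule only, not MAX's actual code; the helper name `resolve_needs_bitmask` is hypothetical, and the config's `enable_structured_output` value is passed in as a plain bool so the sketch runs without MAX installed.

```python
from typing import Optional


def resolve_needs_bitmask(
    needs_bitmask_input: Optional[bool],
    enable_structured_output: bool,
) -> bool:
    """Hypothetical helper mirroring the documented fallback rule."""
    # None means "not specified": defer to the sampling config's setting.
    if needs_bitmask_input is None:
        return enable_structured_output
    # An explicit True or False always wins over the config.
    return needs_bitmask_input


# Structured output is off, but tool-call grammars may still fire,
# so the caller asks for the bitmask input explicitly.
wire_bitmask = resolve_needs_bitmask(True, enable_structured_output=False)
```

Under this rule, leaving the argument as None with structured output disabled would produce a graph without a bitmask input, which is why the explicit True matters in the tool-call case.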

Returns:

A graph that takes logits (and optional penalty inputs) and outputs tokens.

Return type:

Graph
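
For readers unfamiliar with what "samples tokens from logits" involves, the following is a minimal standard-library sketch of top-k sampling with temperature. It does not use MAX and is not how the returned graph is implemented; the function name `sample_top_k` and its parameters are assumptions chosen for illustration.

```python
import math
import random
from typing import Sequence


def sample_top_k(
    logits: Sequence[float],
    top_k: int,
    temperature: float,
    rng: random.Random,
) -> int:
    """Return a token index drawn from the top_k highest logits."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Keep the indices of the top_k largest logits.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    candidates = order[:top_k]

    # Scale by temperature, then apply a numerically stable softmax.
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # Draw one candidate in proportion to its softmax weight.
    return rng.choices(candidates, weights=weights, k=1)[0]


token = sample_top_k([0.1, 2.0, -1.0], top_k=2, temperature=0.7, rng=random.Random(0))
```

With top_k set to 1 the draw reduces to picking the highest logit; larger values of top_k and temperature spread probability over more candidates.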
