Python class

SamplingParamsInput

class max.pipelines.context.SamplingParamsInput(top_k=None, top_p=None, min_p=None, temperature=None, thinking_temperature=None, frequency_penalty=None, presence_penalty=None, repetition_penalty=None, max_new_tokens=None, min_new_tokens=None, ignore_eos=None, stop=None, stop_token_ids=None, detokenize=None, seed=None, logits_processors=None)

Bases: object

Input dataclass for creating SamplingParams instances.

All fields are optional, allowing partial specification with None values indicating “use default”. This enables static type checking while maintaining the flexibility to specify only the parameters you want to override.
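The "None means use default" pattern can be illustrated without importing MAX. The sketch below uses two hypothetical stand-in dataclasses (`ParamsInput` and `ResolvedParams`) and made-up default values; it is not the library's resolution logic, only one way such an all-optional input could be merged onto defaults.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ResolvedParams:
    """Hypothetical stand-in for fully resolved sampling parameters."""

    top_k: int = 50            # illustrative default, not MAX's actual value
    temperature: float = 1.0   # illustrative default, not MAX's actual value


@dataclass(frozen=True)
class ParamsInput:
    """Hypothetical stand-in for an all-optional input dataclass."""

    top_k: Optional[int] = None
    temperature: Optional[float] = None


def resolve(overrides: ParamsInput,
            defaults: ResolvedParams = ResolvedParams()) -> ResolvedParams:
    """Copy only the fields that were explicitly set (not None) onto the defaults."""
    explicit = {
        f.name: getattr(overrides, f.name)
        for f in fields(overrides)
        if getattr(overrides, f.name) is not None
    }
    return replace(defaults, **explicit)


# Only temperature is overridden; top_k keeps the stand-in default.
resolved = resolve(ParamsInput(temperature=0.2))
print(resolved)
```

Because every field is typed as `T | None`, a type checker can still flag a misspelled field name or a wrongly typed value, which an untyped dict of overrides would not allow.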

Parameters:

detokenize

detokenize: bool | None = None

Whether to convert output token IDs back to text. Defaults to None (use class default).

frequency_penalty

frequency_penalty: float | None = None

The penalty applied proportionally to token frequency in the generated text. Defaults to None (use class default).

ignore_eos

ignore_eos: bool | None = None

Whether to continue generating past end-of-sequence tokens. Defaults to None (use class default).

logits_processors

logits_processors: Sequence[Callable[[ProcessorInputs], None]] | None = None

Callables applied to model logits before sampling. Defaults to None (use class default).
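The type signature indicates that each processor takes a `ProcessorInputs` object and returns `None`, which suggests it modifies the logits in place. This page does not list the fields of `ProcessorInputs`, so the sketch below uses a hypothetical stand-in class with an assumed `logits` attribute; the helper names `ban_token` and `run_processors` are also illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class StandInProcessorInputs:
    """Hypothetical stand-in; the real ProcessorInputs fields are not shown here."""

    logits: list[float]  # assumed attribute, for illustration only


Processor = Callable[[StandInProcessorInputs], None]


def ban_token(token_id: int) -> Processor:
    """Build a processor that makes one token impossible to sample."""

    def processor(inputs: StandInProcessorInputs) -> None:
        # Mutate in place and return None, matching the callable's signature.
        inputs.logits[token_id] = float("-inf")

    return processor


def run_processors(inputs: StandInProcessorInputs,
                   processors: Sequence[Processor]) -> None:
    """Apply each processor in order before sampling would take place."""
    for processor in processors:
        processor(inputs)


inputs = StandInProcessorInputs(logits=[1.0, 2.0, 3.0])
run_processors(inputs, [ban_token(2)])
print(inputs.logits)
```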

max_new_tokens

max_new_tokens: int | None = None

The maximum number of tokens to generate. Defaults to None (use class default).

min_new_tokens

min_new_tokens: int | None = None

The minimum number of tokens to generate before stopping. Defaults to None (use class default).
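One way to picture how `max_new_tokens`, `min_new_tokens`, and `ignore_eos` could interact is a single stop check run after each generated token. The function below is a hypothetical sketch: its name, the `eos_token_id` argument, and the precedence (the maximum wins over the minimum) are assumptions, not behavior documented on this page.

```python
from typing import Optional


def should_stop(num_generated: int,
                last_token_id: int,
                eos_token_id: int,
                max_new_tokens: Optional[int] = None,
                min_new_tokens: int = 0,
                ignore_eos: bool = False) -> bool:
    """Decide whether generation ends after the most recent token."""
    # Assumed precedence: the hard cap always ends generation.
    if max_new_tokens is not None and num_generated >= max_new_tokens:
        return True
    # Below the minimum, an end-of-sequence token does not stop generation.
    if num_generated < min_new_tokens:
        return False
    # With ignore_eos set, only the cap can end generation in this sketch.
    if ignore_eos:
        return False
    return last_token_id == eos_token_id


# End-of-sequence token (id 2) seen after 3 tokens, but the minimum is 5.
print(should_stop(3, last_token_id=2, eos_token_id=2, min_new_tokens=5))
```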

min_p

min_p: float | None = None

The minimum probability threshold for a token relative to the most likely token. Defaults to None (use class default).
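As a rough illustration of a threshold "relative to the most likely token", the sketch below drops every token whose probability is below `min_p` times the largest probability, then renormalizes. The function name and the renormalization step are assumptions; this page does not describe MAX's exact implementation.

```python
def min_p_filter(probs: list[float], min_p: float) -> list[float]:
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)  # never zero: the most likely token always survives
    return [p / total for p in kept]


probs = [0.5, 0.3, 0.15, 0.05]
# The threshold is 0.2 * 0.5 = 0.1, so only the last token is removed.
filtered = min_p_filter(probs, min_p=0.2)
print(filtered)
```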

presence_penalty

presence_penalty: float | None = None

The flat penalty applied to tokens that have appeared at least once. Defaults to None (use class default).
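The difference between a flat presence penalty and a count-proportional frequency penalty is easiest to see side by side. The sketch below follows a commonly used additive convention (subtract `frequency_penalty * count + presence_penalty` from the logit of each token already generated); whether MAX uses exactly this formula is an assumption, and `apply_penalties` is an illustrative name.

```python
from collections import Counter


def apply_penalties(logits: list[float],
                    generated_ids: list[int],
                    frequency_penalty: float = 0.0,
                    presence_penalty: float = 0.0) -> list[float]:
    """Return a penalized copy of the logits for tokens already generated."""
    counts = Counter(generated_ids)
    out = list(logits)
    for token_id, count in counts.items():
        # Frequency term grows with the count; presence term is applied once.
        out[token_id] -= frequency_penalty * count + presence_penalty
    return out


logits = [2.0, 1.0, 0.5, 0.0]
# Token 0 was generated twice, token 2 once.
penalized = apply_penalties(logits, [0, 0, 2],
                            frequency_penalty=0.5, presence_penalty=0.25)
print(penalized)  # [0.75, 1.0, -0.25, 0.0]
```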

repetition_penalty

repetition_penalty: float | None = None

The factor by which logits of repeated tokens are divided. Defaults to None (use class default).
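A minimal sketch of a divisive repetition penalty follows. The description above only states that logits of repeated tokens are divided by the factor; multiplying negative logits instead (so the penalty always lowers the token's score) is a common convention in other libraries and is an assumption here, as is the function name.

```python
def apply_repetition_penalty(logits: list[float],
                             seen_ids: list[int],
                             penalty: float) -> list[float]:
    """Return a copy of the logits with previously seen tokens penalized."""
    out = list(logits)
    for token_id in set(seen_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty
        else:
            # Assumed handling: dividing a negative logit would raise it,
            # so multiply instead to keep the penalty a penalty.
            out[token_id] *= penalty
    return out


penalized = apply_repetition_penalty([4.0, -2.0, 1.0], seen_ids=[0, 1, 1], penalty=2.0)
print(penalized)  # [2.0, -4.0, 1.0]
```

With this convention a penalty of 1.0 leaves the logits unchanged.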

seed

seed: int | None = None

The random seed for reproducible sampling. Defaults to None (use class default).
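The effect of a fixed seed can be shown with Python's standard `random` module: two samplers built from the same seed draw the same token sequence from the same distribution. This is a generic illustration with hypothetical helper names, not MAX's sampler.

```python
import random


def sample_sequence(probs: list[float], seed: int, length: int = 5) -> list[int]:
    """Draw `length` token IDs from `probs` using a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    token_ids = range(len(probs))
    return [rng.choices(token_ids, weights=probs, k=1)[0] for _ in range(length)]


probs = [0.5, 0.3, 0.2]
first = sample_sequence(probs, seed=42)
second = sample_sequence(probs, seed=42)
print(first == second)  # True
```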

stop

stop: list[str] | None = None

A list of strings that, when generated, will stop the generation. Defaults to None (use class default).
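Stop strings can be pictured as a search over the generated text for the earliest match. The sketch below truncates the output before the first stop string it finds; whether the matched string itself is kept in the returned text is not stated on this page, so excluding it is an assumption, and `truncate_at_stop` is a hypothetical helper.

```python
def truncate_at_stop(text: str, stop: list[str]) -> str:
    """Cut `text` just before the earliest occurrence of any stop string."""
    cut = len(text)
    for stop_string in stop:
        if not stop_string:
            continue  # an empty string would match at index 0
        index = text.find(stop_string)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


output = truncate_at_stop("Answer: 42\nUser: next question", ["\nUser:", "###"])
print(repr(output))  # 'Answer: 42'
```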

stop_token_ids

stop_token_ids: list[int] | None = None

A list of token IDs that, when generated, will stop the generation. Defaults to None (use class default).

temperature

temperature: float | None = None

The temperature for controlling output randomness. Defaults to None (use class default).
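Temperature is conventionally applied by dividing the logits by it before the softmax: values below 1 sharpen the distribution and values above 1 flatten it. The sketch below shows that standard formulation in plain Python; it assumes a strictly positive temperature and does not reflect how MAX treats a temperature of 0.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
# The top token receives more probability mass at the lower temperature.
print(sharp[0] > flat[0])  # True
```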

thinking_temperature

thinking_temperature: float | None = None

Temperature override for tokens inside a <think>...</think> block. Requires a configured reasoning parser to resolve boundary token IDs.
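The idea of a separate temperature inside a reasoning block can be sketched as a per-step choice based on whether the most recent boundary token opened or closed the block. Everything in the example is hypothetical: the function name, the boundary token IDs (100 and 101), and the fallback to the regular temperature when no override is set. In MAX the boundary IDs come from the configured reasoning parser.

```python
from typing import Optional

THINK_START_ID = 100  # hypothetical ID for the opening <think> token
THINK_END_ID = 101    # hypothetical ID for the closing </think> token


def temperature_for_next_token(generated_ids: list[int],
                               temperature: float,
                               thinking_temperature: Optional[float] = None) -> float:
    """Pick the temperature for the next token from the tokens generated so far."""
    inside_think = False
    for token_id in generated_ids:
        if token_id == THINK_START_ID:
            inside_think = True
        elif token_id == THINK_END_ID:
            inside_think = False
    if inside_think and thinking_temperature is not None:
        return thinking_temperature
    return temperature


# The block was opened (100) and not yet closed, so the override applies.
print(temperature_for_next_token([5, 100, 7, 8], temperature=0.7,
                                 thinking_temperature=1.0))  # 1.0
```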

top_k

top_k: int | None = None

The number of most probable tokens to keep when sampling. Defaults to None (use class default).
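Top-k filtering keeps the `k` highest-scoring tokens and masks the rest so they cannot be sampled. The sketch below is a generic plain-Python version with a hypothetical function name; it requires `k >= 1` and treats a `k` at or above the vocabulary size as a no-op, which are choices of this example rather than documented MAX behavior.

```python
def top_k_filter(logits: list[float], k: int) -> list[float]:
    """Mask all but the k largest logits with negative infinity."""
    if k < 1:
        raise ValueError("this sketch requires k >= 1")
    if k >= len(logits):
        return list(logits)
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(ranked[:k])
    return [x if i in keep else float("-inf") for i, x in enumerate(logits)]


filtered = top_k_filter([1.0, 3.0, 2.0, 0.5], k=2)
print(filtered)  # [-inf, 3.0, 2.0, -inf]
```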

top_p

top_p: float | None = None

The cumulative probability threshold for nucleus sampling. Defaults to None (use class default).
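Nucleus sampling keeps the smallest set of most probable tokens whose cumulative probability reaches `top_p`, then renormalizes over that set. The sketch below is a generic plain-Python version with a hypothetical function name; it is meant to illustrate the threshold, not to mirror MAX's implementation.

```python
def top_p_filter(probs: list[float], top_p: float) -> list[float]:
    """Keep the most probable tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: set[int] = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the token that crosses the threshold is included
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


probs = [0.5, 0.3, 0.15, 0.05]
# 0.5 alone is below 0.7; adding 0.3 crosses it, so two tokens survive.
filtered = top_p_filter(probs, top_p=0.7)
print(filtered)
```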