Python class
SamplingParams
class max.interfaces.SamplingParams(top_k=-1, top_p=1, min_p=0.0, temperature=1, frequency_penalty=0.0, presence_penalty=0.0, repetition_penalty=1.0, max_new_tokens=None, min_new_tokens=0, ignore_eos=False, stop=None, stop_token_ids=None, detokenize=True, seed=<factory>, logits_processors=None)
Bases: object
Request-specific sampling parameters that are only known at run time.
Parameters:
- top_k (int)
- top_p (float)
- min_p (float)
- temperature (float)
- frequency_penalty (float)
- presence_penalty (float)
- repetition_penalty (float)
- max_new_tokens (int | None)
- min_new_tokens (int)
- ignore_eos (bool)
- stop (list[str] | None)
- stop_token_ids (list[int] | None)
- detokenize (bool)
- seed (int)
- logits_processors (Sequence[Callable[[ProcessorInputs], None]] | None)
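For example, a typical request-level configuration might look like the following sketch (the field values are illustrative, not recommendations):

from max.interfaces import SamplingParams

params = SamplingParams(
    temperature=0.7,     # softer distribution than the default of 1
    top_k=40,            # consider only the 40 most probable tokens
    top_p=0.95,          # nucleus filtering within the top_k candidates
    max_new_tokens=256,  # stop after at most 256 generated tokens
    stop=["\n\n"],       # also stop at the first blank line
    seed=1234,           # fixed seed for reproducible sampling
)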
detokenize
detokenize: bool = True
Whether to detokenize the output tokens into text.
frequency_penalty
frequency_penalty: float = 0.0
The frequency penalty to apply to the model’s output. A positive value will penalize new tokens based on their frequency in the generated text: tokens will receive a penalty proportional to the count of appearances.
from_input_and_generation_config()
classmethod from_input_and_generation_config(input_params, sampling_params_defaults)
Creates a SamplingParams instance with defaults from a HuggingFace GenerationConfig.
Combines three sources of values in priority order (highest to lowest):
- User-provided values in input_params (non-None)
- Model’s GenerationConfig values (only if explicitly set in the model’s config)
- SamplingParams class defaults
Parameters:
- input_params (SamplingParamsInput) – Dataclass containing user-specified parameter values. Values of None will be replaced with model defaults or class defaults.
- sampling_params_defaults (SamplingParamsGenerationConfigDefaults) – SamplingParamsGenerationConfigDefaults containing default sampling parameters extracted from the model’s GenerationConfig.
Returns:
A new SamplingParams instance with model-aware defaults.
Return type:
SamplingParams
Example:

params = SamplingParams.from_input_and_generation_config(
    SamplingParamsInput(temperature=0.7),
    sampling_params_defaults=model_config.sampling_params_defaults,
)

ignore_eos
ignore_eos: bool = False
If True, the response will ignore the EOS token, and continue to
generate until the max tokens or a stop string is hit.
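For instance, to benchmark a fixed-length generation you might combine ignore_eos with the token-count bounds (an illustrative sketch; it assumes no stop strings are set):

fixed_len = SamplingParams(
    ignore_eos=True,     # do not stop at the EOS token
    min_new_tokens=128,  # generate at least 128 tokens...
    max_new_tokens=128,  # ...but no more than 128
)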
log_sampling_info()
log_sampling_info()
Logs comprehensive sampling parameter information.
Displays all sampling parameters in a consistent visual format similar to pipeline configuration logging.
Return type:
None
logits_processors
logits_processors: Sequence[Callable[[ProcessorInputs], None]] | None = None
Callables to post-process the model logits.
See LogitsProcessor for examples.
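The exact fields of ProcessorInputs are not documented in this section; the sketch below assumes it exposes a mutable logits array, so treat that field access as hypothetical and check LogitsProcessor for the real interface:

BANNED_TOKEN_ID = 42  # hypothetical token id to suppress

def ban_token(inputs) -> None:
    # Assumes ProcessorInputs has a mutable `logits` array (an assumption;
    # see LogitsProcessor for the actual fields). Mutates logits in place,
    # matching the Callable[[ProcessorInputs], None] signature.
    inputs.logits[..., BANNED_TOKEN_ID] = float("-inf")

params = SamplingParams(logits_processors=[ban_token])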
max_new_tokens
max_new_tokens: int | None = None
The maximum number of new tokens to generate in the response.
When set to an integer value, generation will stop after this many tokens.
When None (default), the model may generate tokens until it reaches its
internal limits or other stopping criteria are met.
min_new_tokens
min_new_tokens: int = 0
The minimum number of tokens to generate in the response.
min_p
min_p: float = 0.0
Float that represents the minimum probability for a token to be considered, relative to the probability of the most likely token. Must be in [0, 1]. Set to 0 to disable this.
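In other words, a token survives min_p filtering only if its probability is at least min_p times that of the most likely token. A minimal NumPy sketch of that rule (illustrative, not MAX’s implementation):

import numpy as np

def min_p_mask(probs: np.ndarray, min_p: float) -> np.ndarray:
    # Keep tokens whose probability is >= min_p * the top token's probability.
    return probs >= min_p * probs.max()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(min_p_mask(probs, min_p=0.2))  # [ True  True  True False]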
needs_penalties
property needs_penalties: bool
Whether penalties are needed for the set of sampling parameters.
presence_penalty
presence_penalty: float = 0.0
The presence penalty to apply to the model’s output. A positive value will penalize new tokens that have already appeared in the generated text at least once by applying a constant penalty.
repetition_penalty
repetition_penalty: float = 1.0
The repetition penalty to apply to the model’s output. Values > 1 will penalize new tokens that have already appeared in the generated text at least once by dividing the logits by the repetition penalty.
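Taken together, frequency_penalty, presence_penalty, and repetition_penalty adjust logits based on tokens already generated. The NumPy sketch below follows the descriptions above; it is illustrative only, not MAX’s internal implementation (in particular, it uses the common sign-aware form of the repetition penalty, which multiplies negative logits rather than dividing them):

import numpy as np

def apply_penalties(logits, generated_ids,
                    frequency_penalty=0.0, presence_penalty=0.0,
                    repetition_penalty=1.0):
    counts = np.bincount(generated_ids, minlength=logits.shape[-1])
    seen = counts > 0
    # frequency_penalty: proportional to how many times a token appeared
    logits = logits - frequency_penalty * counts
    # presence_penalty: constant penalty for any token seen at least once
    logits = logits - presence_penalty * seen
    # repetition_penalty > 1: shrink positive logits of seen tokens by division,
    # scale negative ones by multiplication (the sign-aware variant)
    return np.where(seen & (logits > 0), logits / repetition_penalty,
                    np.where(seen, logits * repetition_penalty, logits))

logits = np.array([2.0, -1.0, 0.5])
penalized = apply_penalties(logits, np.array([0, 0, 2]),
                            frequency_penalty=0.1, presence_penalty=0.2,
                            repetition_penalty=1.2)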
seed
seed: int
The seed to use for the random number generator. Defaults to a cryptographically secure random value.
stop
stop: list[str] | None = None
A list of detokenized sequences that can be used as stop criteria when generating a new sequence.
stop_token_ids
stop_token_ids: list[int] | None = None
A list of token ids that are used as stopping criteria when generating a new sequence.
temperature
temperature: float = 1
Controls the randomness of the model’s output; higher values produce more diverse responses. For greedy sampling, set temperature to 0.
top_k
top_k: int = -1
Limits sampling to the K most probable tokens. Defaults to -1 (sample from all tokens); for greedy sampling, set to 1.
top_p
top_p: float = 1
Only sample from tokens whose cumulative probability falls within the top_p threshold. This is applied within the top_k tokens.
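Putting temperature, top_k, and top_p together, two common configurations look like this (illustrative values):

greedy = SamplingParams(temperature=0, top_k=1)        # deterministic argmax decoding
diverse = SamplingParams(temperature=1.2, top_p=0.9)   # broader, more random sampling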