Python class

SpeculativeConfig

class max.pipelines.SpeculativeConfig(*, config_file=None, section_name=None, speculative_method=None, num_speculative_tokens=2, rejection_sampling_strategy=None, synthetic_acceptance_rate=None)

Bases: ConfigFileModel

Configuration for speculative decoding.

Parameters:

  • config_file (str | None)
  • section_name (str | None)
  • speculative_method (Literal['standalone', 'eagle', 'mtp'] | None)
  • num_speculative_tokens (int)
  • rejection_sampling_strategy (Literal['greedy', 'residual', 'typical-acceptance', 'logit-comparison'] | None)
  • synthetic_acceptance_rate (float | None)
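The parameter list above restricts two arguments to fixed string values. As a minimal sketch, assuming only the signature and defaults shown on this page, the helper below checks a dictionary of keyword arguments against those documented names and values before they are handed to `SpeculativeConfig`. The name `check_speculative_kwargs` and the module-level constants are hypothetical and not part of MAX; the real class performs its own validation.

```python
# Hypothetical pre-flight check; not part of max.pipelines.
# Names, allowed values, and defaults are copied from the documented signature.
ALLOWED_METHODS = {"standalone", "eagle", "mtp"}
ALLOWED_STRATEGIES = {"greedy", "residual", "typical-acceptance", "logit-comparison"}
DEFAULTS = {
    "config_file": None,
    "section_name": None,
    "speculative_method": None,
    "num_speculative_tokens": 2,
    "rejection_sampling_strategy": None,
    "synthetic_acceptance_rate": None,
}


def check_speculative_kwargs(kwargs: dict) -> dict:
    """Return the documented defaults overlaid with kwargs, or raise ValueError."""
    unknown = set(kwargs) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    merged = {**DEFAULTS, **kwargs}

    method = merged["speculative_method"]
    if method is not None and method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported speculative_method: {method!r}")

    strategy = merged["rejection_sampling_strategy"]
    if strategy is not None and strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unsupported rejection_sampling_strategy: {strategy!r}")

    return merged
```

If the check passes, the returned dictionary could be unpacked into the constructor, since every parameter is keyword-only.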

is_eagle()

is_eagle()

Returns whether the speculative method is EAGLE (shared embedding/lm_head).

Return type:

bool

is_mtp()

is_mtp()

Returns whether the speculative method is MTP.

Return type:

bool

is_standalone()

is_standalone()

Returns whether the speculative method is a standalone model.

Return type:

bool
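
The three `is_*` predicates documented here can drive a branch on the speculative method without comparing strings at the call site. The sketch below is illustrative only: `MethodOnlyConfig` is a hypothetical stand-in whose predicates are assumed to be plain equality checks on `speculative_method`, which this page does not confirm. `describe_method` is also hypothetical and relies only on the documented method names, so it would accept any object that exposes them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MethodOnlyConfig:
    """Hypothetical stand-in for illustration; not the MAX class."""

    speculative_method: Optional[str] = None

    # Assumption: each predicate is a simple equality check on the field.
    def is_eagle(self) -> bool:
        return self.speculative_method == "eagle"

    def is_mtp(self) -> bool:
        return self.speculative_method == "mtp"

    def is_standalone(self) -> bool:
        return self.speculative_method == "standalone"


def describe_method(config) -> Optional[str]:
    """Map the documented predicates back to a method name, or None if none match."""
    if config.is_eagle():
        return "eagle"
    if config.is_mtp():
        return "mtp"
    if config.is_standalone():
        return "standalone"
    return None
```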

model_config

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'strict': False}

Configuration for the model. It should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).

model_post_init()

model_post_init(context, /)

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

  • self (BaseModel) – The BaseModel instance.
  • context (Any) – The context.

Return type:

None

num_speculative_tokens

num_speculative_tokens: int

The number of speculative tokens to generate per step.
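
To make the role of this setting concrete, here is a minimal sketch of one speculative decoding step with greedy verification, a generic textbook form of the technique. It is not taken from MAX, and this page does not confirm that the 'greedy' strategy behaves exactly this way. The function `speculative_step` and the callables `draft_next` and `target_next` are hypothetical; each callable maps a token sequence to the next token.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[List[int]], int]


def speculative_step(
    prefix: Sequence[int],
    draft_next: NextToken,
    target_next: NextToken,
    num_speculative_tokens: int = 2,
) -> List[int]:
    """Return the tokens committed by one draft-then-verify step."""
    # Draft phase: the cheap model proposes num_speculative_tokens tokens.
    drafted: List[int] = []
    for _ in range(num_speculative_tokens):
        drafted.append(draft_next(list(prefix) + drafted))

    # Verify phase: keep drafted tokens while they match the target's choice.
    # (A real system would score all positions in one target forward pass.)
    accepted: List[int] = []
    for token in drafted:
        expected = target_next(list(prefix) + accepted)
        if token != expected:
            # First mismatch: commit the target's token instead and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(list(prefix) + accepted))
    return accepted
```

Under these assumptions a step commits between 1 and `num_speculative_tokens + 1` tokens, which is why a larger value only pays off when the draft model agrees with the target often.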

rejection_sampling_strategy

rejection_sampling_strategy: RejectionSamplingStrategy | None

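The rejection sampling strategy to use. Per the constructor parameters, the accepted values are 'greedy', 'residual', 'typical-acceptance', and 'logit-comparison', or None.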

speculative_method

speculative_method: SpeculativeMethod | None

The speculative decoding method to use.

synthetic_acceptance_rate

synthetic_acceptance_rate: float | None

uses_greedy_rejection()

uses_greedy_rejection()

Returns whether the greedy rejection sampling strategy is used.

Return type:

bool

uses_logit_comparison()

uses_logit_comparison()

Returns whether the logit-comparison sampling strategy is used.

Return type:

bool

uses_typical_acceptance()

uses_typical_acceptance()

Returns whether the typical-acceptance sampling strategy is used.

Return type:

bool
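
The `uses_*` predicates cover three of the four strategy values listed for `rejection_sampling_strategy`; no predicate for 'residual' appears on this page. The sketch below shows one way a caller might pick a verification path under that constraint. `StrategyOnlyConfig` and `select_strategy` are hypothetical, and the predicates are assumed to be simple equality checks on the field.

```python
from typing import Optional


class StrategyOnlyConfig:
    """Hypothetical stand-in exposing the documented predicate names."""

    def __init__(self, rejection_sampling_strategy: Optional[str] = None) -> None:
        self.rejection_sampling_strategy = rejection_sampling_strategy

    # Assumption: each predicate compares the field to one documented value.
    def uses_greedy_rejection(self) -> bool:
        return self.rejection_sampling_strategy == "greedy"

    def uses_typical_acceptance(self) -> bool:
        return self.rejection_sampling_strategy == "typical-acceptance"

    def uses_logit_comparison(self) -> bool:
        return self.rejection_sampling_strategy == "logit-comparison"


def select_strategy(config) -> Optional[str]:
    """Resolve the active strategy using the predicates where they exist."""
    if config.uses_greedy_rejection():
        return "greedy"
    if config.uses_typical_acceptance():
        return "typical-acceptance"
    if config.uses_logit_comparison():
        return "logit-comparison"
    # 'residual' (or None) has no documented predicate, so read the field itself.
    return config.rejection_sampling_strategy
```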