Python module
max.pipelines.speculative
Speculative decoding pipelines and configuration for MAX.
Configuration
| Name | Description |
|---|---|
| RejectionSamplingStrategy | alias of Literal['greedy', 'residual', 'typical-acceptance', 'logit-comparison'] |
| SpeculativeConfig | Configures speculative decoding for a pipeline. |
| SpeculativeMethod | alias of Literal['standalone', 'eagle', 'mtp', 'dflash'] |
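
To give a feel for what a rejection sampling strategy decides, here is a minimal sketch of greedy verification as it is commonly described for speculative decoding: draft tokens are accepted while they match the target model's own top choice, and the first mismatch is replaced by the target's token. This is plain Python, not MAX code. The helper name `greedy_accept` and its arguments are hypothetical, and the actual behavior of the 'greedy' strategy in MAX may differ.

```python
from typing import Sequence


def greedy_accept(
    draft_tokens: Sequence[int], target_tokens: Sequence[int]
) -> list[int]:
    """Hypothetical helper, not part of max.pipelines.speculative.

    draft_tokens:  tokens proposed by the draft model.
    target_tokens: the target model's argmax token at each draft position,
                   optionally followed by one extra "bonus" position.
    Returns the tokens that are kept for this step.
    """
    kept: list[int] = []
    for draft, target in zip(draft_tokens, target_tokens):
        if draft != target:
            # First disagreement: keep the target's token and stop.
            kept.append(target)
            return kept
        kept.append(draft)
    # Every draft token matched; keep the bonus token if one was provided.
    if len(target_tokens) > len(draft_tokens):
        kept.append(target_tokens[len(draft_tokens)])
    return kept
```

For example, `greedy_accept([5, 7, 9], [5, 7, 2, 4])` keeps the two matching tokens and then the target's correction, giving `[5, 7, 2]`.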
Token merging
| Name | Description |
|---|---|
| RaggedTokenMerger | Merges prompt and draft token sequences into a single ragged batch. |
| ragged_token_merger | Builds a graph that merges prompt and draft tokens into a single ragged sequence. |
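
The idea of a ragged merge can be sketched in plain Python. The example below assumes that each request's draft tokens are appended directly after its prompt tokens and that the batch is stored as one flat token list plus row offsets. Both the layout and the helper name `merge_ragged` are assumptions made for illustration. This page does not specify how RaggedTokenMerger arranges its output, and the real implementation builds a graph rather than running Python loops.

```python
from typing import Sequence


def merge_ragged(
    prompts: Sequence[Sequence[int]], drafts: Sequence[Sequence[int]]
) -> tuple[list[int], list[int]]:
    """Hypothetical helper, not part of max.pipelines.speculative.

    Concatenates each request's prompt tokens with its draft tokens and
    packs all requests into one flat list. Request i occupies
    tokens[offsets[i]:offsets[i + 1]].
    """
    if len(prompts) != len(drafts):
        raise ValueError("prompts and drafts must have one entry per request")

    tokens: list[int] = []
    offsets: list[int] = [0]
    for prompt, draft in zip(prompts, drafts):
        tokens.extend(prompt)
        tokens.extend(draft)
        offsets.append(len(tokens))  # end of this request's row
    return tokens, offsets


tokens, offsets = merge_ragged([[1, 2, 3], [4]], [[10, 11], [12, 13]])
```

Storing offsets instead of padding each row to a common length means that requests of different lengths share one buffer with no wasted slots, which is the usual motivation for a ragged layout.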