Python class

UnifiedEagleOutputs

class max.pipelines.lib.UnifiedEagleOutputs(*, logits=None, next_token_logits=None, logit_offsets=None, hidden_states=None, num_accepted_draft_tokens, next_tokens, next_draft_tokens)

Bases: ModelOutputs

Outputs from a unified EAGLE graph execution.

Parameters:

  • logits (Buffer | None)
  • next_token_logits (None)
  • logit_offsets (None)
  • hidden_states (None)
  • num_accepted_draft_tokens (Buffer)
  • next_tokens (Buffer)
  • next_draft_tokens (Buffer)

hidden_states​

hidden_states: None = None

Optional hidden states for text generation.

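On a single device, the shape is [T_h, H], where T_h depends on the hidden-state return mode:

  • none mode: no hidden states are returned and this field is None (default)
  • last-token mode: T_h = B
  • all-token mode: T_h = total_input_tokens

For data-parallel models, the hidden states are on the first GPU because they are replicated.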

logit_offsets​

logit_offsets: None = None

Cumulative row offsets into logits for text generation.

Shape is [B + 1]. Per-sequence logits are: logits[logit_offsets[i]:logit_offsets[i + 1], :].
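
The slicing rule above can be sketched as follows. This is a minimal illustration, not part of the MAX API: plain Python lists stand in for the Buffer fields, the helper name `split_logits_by_sequence` is hypothetical, and the offsets are assumed to start at 0 and end at the total number of logits rows.

```python
from typing import List


def split_logits_by_sequence(
    logits: List[List[float]], logit_offsets: List[int]
) -> List[List[List[float]]]:
    """Split a [T, V] logits table into one block of rows per sequence.

    Hypothetical helper for illustration; lists stand in for Buffers.
    """
    # Assumption: the last cumulative offset equals the number of rows T.
    if not logit_offsets or logit_offsets[-1] != len(logits):
        raise ValueError("logit_offsets[-1] must equal the number of logits rows")
    # Sequence i owns rows logit_offsets[i] up to (not including) logit_offsets[i + 1].
    return [
        logits[logit_offsets[i]:logit_offsets[i + 1]]
        for i in range(len(logit_offsets) - 1)
    ]


# Toy data: B = 2 sequences, V = 3 vocabulary entries, T = 3 rows in total.
logits = [[0.1, 0.2, 0.7], [0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]
logit_offsets = [0, 2, 3]  # length B + 1
per_sequence = split_logits_by_sequence(logits, logit_offsets)
```

Here the first sequence receives two rows of logits and the second receives one, which is why the offsets have B + 1 entries rather than B.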

logits​

logits: Buffer | None = None

Primary logits buffer.

For text generation this has shape [T, V] where:

  • last-token mode: T = B (default)
  • all-token mode: T = total_input_tokens
  • variable mode: T = logit_offsets[-1] (typically B * return_n_logits)
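
To make the three cases above concrete, the sketch below computes the expected row count T for each mode. The mode strings and both helper names are hypothetical and chosen only for this example; the uniform offsets assume every sequence returns the same number of logits rows, which is the "typically B * return_n_logits" case.

```python
from typing import List, Optional


def expected_logits_rows(
    mode: str,
    batch_size: int,
    total_input_tokens: int,
    logit_offsets: Optional[List[int]] = None,
) -> int:
    """Return T, the expected number of rows in a [T, V] logits buffer.

    The mode names are illustrative labels, not identifiers from MAX.
    """
    if mode == "last_token":
        return batch_size  # one row per sequence
    if mode == "all_token":
        return total_input_tokens  # one row per input token
    if mode == "variable":
        if not logit_offsets:
            raise ValueError("variable mode needs logit_offsets")
        return logit_offsets[-1]  # last cumulative offset
    raise ValueError(f"unknown mode: {mode!r}")


def uniform_offsets(batch_size: int, return_n_logits: int) -> List[int]:
    """Build cumulative offsets when every sequence returns the same row count."""
    return [i * return_n_logits for i in range(batch_size + 1)]


offsets = uniform_offsets(batch_size=4, return_n_logits=2)
rows = expected_logits_rows("variable", 4, 37, offsets)
```

With four sequences and two rows each, the last offset and therefore T is B * return_n_logits.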

next_draft_tokens​

next_draft_tokens: Buffer

next_token_logits​

next_token_logits: None = None

Next-token logits for text generation, shape [B, V] when present.

next_tokens​

next_tokens: Buffer

num_accepted_draft_tokens​

num_accepted_draft_tokens: Buffer
