UnifiedEagleOutputs

class max.pipelines.lib.UnifiedEagleOutputs(*, logits=None, next_token_logits=None, logit_offsets=None, hidden_states=None, num_accepted_draft_tokens, next_tokens, next_draft_tokens)


Bases: ModelOutputs

Outputs from a unified EAGLE graph execution.

Parameters:

  • logits (Buffer | None)
  • next_token_logits (None)
  • logit_offsets (None)
  • hidden_states (None)
  • num_accepted_draft_tokens (Buffer)
  • next_tokens (Buffer)
  • next_draft_tokens (Buffer)

hidden_states

hidden_states: None = None


Optional hidden states for text generation.

Single-device shape is [T_h, H] where:

  • none mode: hidden_states is None (default)
  • last-token mode: T_h = B
  • all-token mode: T_h = total_input_tokens

For data-parallel models, the hidden states are replicated, so this buffer resides on the first GPU.
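The three shape modes above can be sketched with NumPy arrays standing in for Buffer; the batch size, input lengths, and hidden size here are hypothetical:

```python
import numpy as np

# Hypothetical batch: B = 2 sequences with input lengths 4 and 3,
# hidden size H = 8.
B, H = 2, 8
input_lengths = [4, 3]
total_input_tokens = sum(input_lengths)

hidden_states_none = None                              # none mode (default)
hidden_states_last = np.zeros((B, H))                  # last-token mode: T_h = B
hidden_states_all = np.zeros((total_input_tokens, H))  # all-token mode: T_h = total_input_tokens

print(hidden_states_last.shape, hidden_states_all.shape)  # (2, 8) (7, 8)
```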

logit_offsets

logit_offsets: None = None


Cumulative row offsets into logits for text generation.

Shape is [B + 1]. Per-sequence logits are: logits[logit_offsets[i]:logit_offsets[i + 1], :].
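The slicing rule above can be illustrated with NumPy arrays standing in for Buffer; the offsets and vocabulary size here are hypothetical:

```python
import numpy as np

# Hypothetical batch: B = 3 sequences contributing 2, 1, and 3 logit
# rows respectively; vocabulary size V = 5.
V = 5
logit_offsets = np.array([0, 2, 3, 6])          # shape [B + 1]
logits = np.random.rand(logit_offsets[-1], V)   # shape [T, V], T = logit_offsets[-1]

# Recover each sequence's logits by slicing between consecutive offsets.
per_sequence = [
    logits[logit_offsets[i]:logit_offsets[i + 1], :]
    for i in range(len(logit_offsets) - 1)
]

print([x.shape for x in per_sequence])  # [(2, 5), (1, 5), (3, 5)]
```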

logits

logits: Buffer | None = None


Primary logits buffer.

For text generation this has shape [T, V] where:

  • last-token mode: T = B (default)
  • all-token mode: T = total_input_tokens
  • variable mode: T = logit_offsets[-1] (typically B * return_n_logits)

next_draft_tokens

next_draft_tokens: Buffer


next_token_logits

next_token_logits: None = None


Next-token logits for text generation, shape [B, V] when present.

next_tokens

next_tokens: Buffer


num_accepted_draft_tokens

num_accepted_draft_tokens: Buffer

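The num_accepted_draft_tokens, next_tokens, and next_draft_tokens buffers together drive one step of speculative decoding. A minimal sketch of the generic greedy accept/reject rule, not MAX's implementation; the function name, token values, and acceptance criterion here are illustrative:

```python
import numpy as np

def count_accepted(draft_tokens: np.ndarray, target_argmax: np.ndarray) -> int:
    """Greedy acceptance: count leading draft tokens that match the
    target model's argmax predictions, stopping at the first mismatch."""
    n = 0
    for d, t in zip(draft_tokens, target_argmax):
        if d != t:
            break
        n += 1
    return n

# Hypothetical single-sequence step: the draft model proposed 4 tokens;
# the target model's verification pass predicts these argmax tokens.
draft_tokens = np.array([11, 42, 7, 99])
target_argmax = np.array([11, 42, 8, 99])

num_accepted = count_accepted(draft_tokens, target_argmax)
# The accepted prefix plus the target's correction feed next_tokens,
# while the draft model's new proposals fill next_draft_tokens.
print(num_accepted)  # 2
```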