ModelOutputs
class max.pipelines.ModelOutputs(logits, next_token_logits=None, logit_offsets=None, hidden_states=None)
Bases: object
Pipeline model outputs.
Shape conventions below are for text-generation pipelines:
- B: batch size
- V: vocabulary size
- H: hidden-state width
- T: number of returned logit rows (depends on return mode)
The shapes depend on the values of the ReturnLogits and ReturnHiddenStates
enums. Unless running with speculative decoding, the defaults are
ReturnLogits.LAST_TOKEN and ReturnHiddenStates.NONE.
Parameters:
hidden_states
Optional hidden states for text generation.
Single-device shape is [T_h, H] where:
- none mode (default): no hidden states are returned
- last-token mode: T_h = B
- all-token mode: T_h = total_input_tokens
For data-parallel models, the hidden states reside on the first GPU, since they are replicated across devices.
logit_offsets
Cumulative row offsets into logits for text generation.
Shape is [B + 1]. Per-sequence logits are:
logits[logit_offsets[i]:logit_offsets[i + 1], :].
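As a minimal illustration of this offset convention (plain Python lists stand in for the real Buffer type, and all shapes and values here are invented):

```python
# Hypothetical example: B = 3 sequences returning 2, 4, and 1 logit rows,
# with V = 2 for brevity, so T = 2 + 4 + 1 = 7.
logits = [[float(i), float(-i)] for i in range(7)]  # [T, V]
logit_offsets = [0, 2, 6, 7]                        # shape [B + 1]

# Per-sequence logits follow the documented slicing rule:
# logits[logit_offsets[i]:logit_offsets[i + 1], :]
per_sequence = [
    logits[logit_offsets[i]:logit_offsets[i + 1]]
    for i in range(len(logit_offsets) - 1)
]
print([len(rows) for rows in per_sequence])  # [2, 4, 1]
```

Note that `logit_offsets[-1]` equals the total number of rows in `logits`, which is why it gives T in variable mode.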
logits
logits: Buffer
Primary logits buffer.
For text generation this has shape [T, V] where:
- last-token mode (default): T = B
- all-token mode: T = total_input_tokens
- variable mode: T = logit_offsets[-1] (typically B * return_n_logits)
next_token_logits
Next-token logits for text generation, shape [B, V] when present.
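When next_token_logits is present, a common use is greedy token selection over the [B, V] buffer. A sketch with plain Python lists standing in for the buffer (values are invented for illustration):

```python
# Hypothetical next_token_logits for B = 2 sequences, V = 4 vocabulary entries.
next_token_logits = [
    [0.1, 2.0, 0.3, 0.0],
    [1.5, 0.2, 0.1, 3.0],
]

# Greedy decoding: choose the highest-scoring token id per sequence.
next_tokens = [row.index(max(row)) for row in next_token_logits]
print(next_tokens)  # [1, 3]
```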