Python module

max.pipelines.modeling.types

Universal interfaces between all aspects of the MAX Inference Stack.

Pipeline base

`InputModality`	Enum representing the types of input a model architecture accepts.
`Pipeline`	Defines the interface for pipeline operations.
`PipelineInputs`	Base class representing inputs to a pipeline operation.
`PipelineInputsType`	Type variable.
`PipelineOutput`	Protocol representing the output of a pipeline operation.
`PipelineOutputType`	Type variable.
`PipelineOutputsDict`	A `dict` type that holds pipeline outputs.
`PipelinesFactory`	Type alias for factory functions that create pipeline instances.
`PipelineTask`	Enum representing the types of pipeline tasks supported.
`PipelineTokenizer`	Interface for LLM tokenizers.
`TokenizerEncoded`	Type variable.
`UnboundContextType`	Type variable.

Text generation

`BatchType`	Type of batch.
`MessageContent`	A PEP 604 union type (`X | Y`) describing the content of a message.
`SpecDecodingState`	Per-request state for speculative decoding.
`TextContentPart`	A plain-text content part of a message.
`TextGenerationContext`	Protocol defining the interface for text generation contexts in token generation.
`TextGenerationContextType`	Type variable.
`TextGenerationInputs`	Input parameters for text generation pipeline operations.
`TextGenerationOutput`	Represents the output of a text generation operation.
`TextGenerationRequest`	An immutable request for text token generation from a pipeline.
`TextGenerationRequestFunction`	Represents a function definition for a text generation request.
`TextGenerationRequestMessage`	A single message in a text generation request conversation.
`TextGenerationRequestTool`	Represents a tool definition for a text generation request.
`TextGenerationResponseFormat`	Represents the response format specification for a text generation request.
`VLMTextGenerationContext`	Protocol defining the interface for VLM input contexts.

Embeddings

`EmbeddingsContext`	Protocol defining the interface for embeddings generation contexts.
`EmbeddingsGenerationContextType`	Type variable.
`EmbeddingsGenerationInputs`	Batched inputs for an embeddings generation pipeline step.
`EmbeddingsGenerationOutput`	Response structure for embedding generation.

Audio generation

`AudioGenerationContextType`	Type variable.
`AudioGenerationInputs`	Input data structure for audio generation pipelines.
`AudioGenerationMetadata`	Represents metadata associated with audio generation.
`AudioGenerationOutput`	Represents a response from the audio generation API.
`AudioGenerationRequest`	An immutable request for audio generation from a pipeline.

Image generation

`ImageContentPart`	An image content part of a message.
`ImageMetadata`	Metadata about an image in the prompt.
`PixelGenerationContext`	Protocol defining the interface for pixel generation contexts.
`PixelGenerationContextType`	Type variable.
`PixelGenerationInputs`	Input data structure for pixel generation pipelines.
`VideoContentPart`	A video content part of a message.

Reasoning

`ParsedReasoningDelta`	Result of applying reasoning parsing to a streaming delta chunk.
`ReasoningParser`	Parser for identifying reasoning spans in model output.
`ReasoningSpan`	Identifies a reasoning span within a token ID sequence.

Tool parsing

`ParsedToolCall`	A parsed tool/function call extracted from model output.
`ParsedToolCallDelta`	Incremental tool call data for streaming responses.
`ParsedToolResponse`	Result of parsing a complete model response for tool calls.
`ToolParser`	Protocol for parsing tool calls from model responses.

Context and sampling

`BaseContext`	Core interface for request lifecycle management across all of MAX, including serving, scheduling, and pipelines.
`BaseContextType`	Type variable.
`EOSTracker`	Centralized EOS tracking: single-ID, sequence-ID, and stop-sequence checks.
`GenerationOutput`	Output container for image generation pipeline operations.
`GenerationStatus`	Enum representing the status of a generation process in the MAX API.
`SamplingParams`	Request-specific sampling parameters that are known only at run time.
`SamplingParamsGenerationConfigDefaults`	Default sampling parameter values extracted from a model's GenerationConfig.
`SamplingParamsInput`	Input dataclass for creating SamplingParams instances.

Requests

`OpenResponsesRequest`	General request container for OpenResponses API requests.
`Request`	Protocol representing a generic request within the MAX API.
`RequestID`	A unique immutable identifier for a request.
`RequestType`	Type variable.

`DUMMY_REQUEST_ID`	A dummy `RequestID` instance (a unique immutable identifier for a request).

Tokens

`LogProbabilities`	Log probabilities for an individual output token.
`Range`	Represents a range with start and end indices.
`TokenBuffer`	A dynamically resizable container for managing token sequences.
`TokenSlice`	A NumPy `ndarray` type representing a slice of tokens.

Logit processors

`BatchLogitsProcessor`	Alias of `Callable[[BatchProcessorInputs], None]`.
`BatchProcessorInputs`	Arguments for a batch logits processor.
`LogitsProcessor`	Alias of `Callable[[ProcessorInputs], None]`.
`ProcessorInputs`	Inputs passed to a logits processor callback.
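
Because both processor types are aliases of a callable that takes an inputs object and returns `None`, a processor presumably works by mutating its inputs in place. The sketch below illustrates only that callable shape. `FakeProcessorInputs`, its `logits` field, and `make_ban_token` are hypothetical stand-ins invented for this example; the real `ProcessorInputs` fields are not listed on this page, so check the class reference before relying on any attribute name.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeProcessorInputs:
    """Hypothetical stand-in for ProcessorInputs (real fields not shown here)."""

    logits: list[float]  # assumed: mutable scores for the next token


# Same shape as the documented alias: takes the inputs, returns None.
FakeLogitsProcessor = Callable[[FakeProcessorInputs], None]


def make_ban_token(token_id: int) -> FakeLogitsProcessor:
    """Build a processor that masks out one token by editing logits in place."""

    def processor(inputs: FakeProcessorInputs) -> None:
        inputs.logits[token_id] = float("-inf")

    return processor


inputs = FakeProcessorInputs(logits=[0.1, 2.0, -0.5])
ban = make_ban_token(1)
result = ban(inputs)  # the processor returns None and mutates inputs
```

Returning `None` means the only way for such a callback to have an effect is through side effects on the object it receives, which is why the stand-in keeps `logits` mutable.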

LoRA

`LoRAOperation`	Enum for different LoRA operations.
`LoRARequest`	Container for LoRA adapter requests.
`LoRAResponse`	Response from LoRA operations.
`LoRAStatus`	Enum for LoRA operation status.
`LoRAType`	Enumeration for LoRA Types.

`LORA_REQUEST_ENDPOINT`	A `str` constant for the LoRA request endpoint.
`LORA_RESPONSE_ENDPOINT`	A `str` constant for the LoRA response endpoint.

Utilities

`SharedMemoryArray`	A wrapper for a NumPy array stored in shared memory.

`msgpack_numpy_decoder`	Create a decoder function for the specified type.
`msgpack_numpy_encoder`	Create an encoder function that handles numpy arrays.
