Python module
max.interfaces
Universal interfaces between all aspects of the MAX Inference Stack.
Pipeline base
Pipeline | Abstract base class for pipeline operations. |
|---|---|
PipelineInputs | Base class representing inputs to a pipeline operation. |
PipelineOutput | Protocol representing the output of a pipeline operation. |
PipelinesFactory | Type alias for factory functions that create pipeline instances. |
PipelineTask | Enum representing the types of pipeline tasks supported. |
PipelineTokenizer | Interface for LLM tokenizers. |
Text generation
BatchType | Type of batch. |
|---|---|
MessageContent | Type alias for message content, expressed as a PEP 604 union type. |
TextContentPart | A plain-text content part of a message. |
TextGenerationContext | Protocol defining the interface for text generation contexts in token generation. |
TextGenerationInputs | Input parameters for text generation pipeline operations. |
TextGenerationOutput | Represents the output of a text generation operation. |
TextGenerationRequest | An immutable request for text token generation from a pipeline. |
TextGenerationRequestFunction | Represents a function definition for a text generation request. |
TextGenerationRequestMessage | A single message in a text generation request conversation. |
TextGenerationRequestTool | Represents a tool definition for a text generation request. |
TextGenerationResponseFormat | Represents the response format specification for a text generation request. |
VLMTextGenerationContext | Protocol defining the interface for VLM input contexts. |
Embeddings
EmbeddingsContext | Protocol defining the interface for embeddings generation contexts. |
|---|---|
EmbeddingsGenerationInputs | Batched inputs for an embeddings generation pipeline step. |
EmbeddingsGenerationOutput | Response structure for embedding generation. |
Audio generation
AudioGenerationInputs | Input data structure for audio generation pipelines. |
|---|---|
AudioGenerationMetadata | Represents metadata associated with audio generation. |
AudioGenerationOutput | Represents a response from the audio generation API. |
AudioGenerationRequest | An immutable request for audio generation from a pipeline. |
Image generation
ImageContentPart | An image content part of a message. |
|---|---|
ImageMetadata | Metadata about an image in the prompt. |
PixelGenerationContext | Protocol defining the interface for pixel generation contexts. |
PixelGenerationInputs | Input data structure for pixel generation pipelines. |
Context and sampling
BaseContext | Core interface for request lifecycle management across all of MAX, including serving, scheduling, and pipelines. |
|---|---|
GenerationOutput | Output container for image generation pipeline operations. |
GenerationStatus | Enum representing the status of a generation process in the MAX API. |
SamplingParams | Request-specific sampling parameters that are only known at run time. |
SamplingParamsGenerationConfigDefaults | Default sampling parameter values extracted from a model's GenerationConfig. |
SamplingParamsInput | Input dataclass for creating SamplingParams instances. |
Requests and scheduling
OpenResponsesRequest | General request container for OpenResponses API requests. |
|---|---|
Request | Protocol representing a generic request within the MAX API. |
RequestID | A unique immutable identifier for a request. |
Scheduler | Abstract base class defining the interface for schedulers. |
SchedulerResult | Structure representing the result of a scheduler operation for a specific pipeline execution. |
Tokens
LogProbabilities | Log probabilities for an individual output token. |
|---|---|
TokenBuffer | A dynamically resizable container for managing token sequences. |
TokenSlice | Alias of NumPy's ndarray. |
Logit processors
BatchLogitsProcessor | alias of Callable[[BatchProcessorInputs], None] |
|---|---|
BatchProcessorInputs | Arguments for a batch logits processor. |
LogitsProcessor | alias of Callable[[ProcessorInputs], None] |
ProcessorInputs | Inputs passed to a logits processor callback. |
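
The aliases above describe a logits processor as a callable that takes its inputs and returns None, which suggests the callback changes the logits in place rather than returning new ones. Here is a minimal sketch of that shape using only the standard library. SketchProcessorInputs and its logits field are hypothetical stand-ins and do not reflect the real fields of ProcessorInputs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SketchProcessorInputs:
    """Hypothetical stand-in for ProcessorInputs; the field name is an assumption."""

    logits: List[float]


# Same shape as the documented alias: Callable[[ProcessorInputs], None].
SketchLogitsProcessor = Callable[[SketchProcessorInputs], None]


def make_ban_token_processor(banned_id: int) -> SketchLogitsProcessor:
    """Build a processor that makes one token ID impossible to sample."""

    def processor(inputs: SketchProcessorInputs) -> None:
        # Mutate in place; the callback returns nothing.
        inputs.logits[banned_id] = float("-inf")

    return processor


example_inputs = SketchProcessorInputs(logits=[0.1, 2.5, -0.3])
make_ban_token_processor(1)(example_inputs)
```

After the call, only the banned position in example_inputs.logits has changed; the other entries are untouched.
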
LoRA
LoRAOperation | Enum for different LoRA operations. |
|---|---|
LoRARequest | Container for LoRA adapter requests. |
LoRAResponse | Response from LoRA operations. |
LoRAStatus | Enum for LoRA operation status. |
LoRAType | Enum for LoRA types. |
Queues
MAXPullQueue | Protocol for a minimal, non-blocking pull queue interface in MAX. |
|---|---|
MAXPushQueue | Protocol for a minimal, non-blocking push queue interface in MAX. |
drain_queue | Remove and return items from the queue without blocking. |
|---|---|
get_blocking | Get the next item from the queue. |
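
To illustrate what draining a non-blocking pull queue can look like, the sketch below uses the standard library's queue.Queue as a stand-in. SketchPullQueue and sketch_drain are hypothetical names, and the get_nowait method and max_items parameter are assumptions made for this example. None of them is the actual MAXPullQueue or drain_queue signature.

```python
import queue
from typing import List, Optional, Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)


class SketchPullQueue(Protocol[T_co]):
    """Hypothetical minimal pull interface: one non-blocking getter."""

    def get_nowait(self) -> T_co:
        """Return the next item or raise queue.Empty if none is ready."""
        ...


def sketch_drain(q: SketchPullQueue[T_co], max_items: Optional[int] = None) -> List[T_co]:
    """Remove and return items that are ready now, without blocking."""
    items: List[T_co] = []
    while max_items is None or len(items) < max_items:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            # Nothing more is ready; stop instead of waiting.
            break
    return items


pending: "queue.Queue[int]" = queue.Queue()
for value in (1, 2, 3):
    pending.put(value)

first_two = sketch_drain(pending, max_items=2)
remainder = sketch_drain(pending)
```

Because the protocol is structural, any object with a matching get_nowait method satisfies it, including queue.Queue.
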
Utilities
SharedMemoryArray | A wrapper for a NumPy array stored in shared memory. |
|---|---|
msgpack_numpy_decoder | Create a decoder function for the specified type. |
|---|---|
msgpack_numpy_encoder | Create an encoder function that handles numpy arrays. |