Python module

max.pipelines

Types to interface with ML pipelines such as text/token/pixel generation.

Configuration

`AudioGenerationConfig`	Configuration for an audio generation pipeline.
`KVCacheConfig`	Configuration for the paged KV cache.
`LoRAConfig`	Configuration for LoRA (Low-Rank Adaptation) inference.
`MAXModelConfig`	Configuration for a pipeline model.
`PipelineConfig`	Configuration for a pipeline.
`ProfilingConfig`	Configuration for GPU profiling of pipeline models.
`SamplingConfig`	Configuration for the sampling stage of token generation.
`SpeculativeConfig`	Configuration for speculative decoding.

Pipelines

`EmbeddingsPipeline`	Pipeline for generating embeddings.
`PixelGenerationPipeline`	Pixel generation pipeline for diffusion models.
`SpeechTokenGenerationPipeline`	A text-to-speech token generation pipeline for TTS models.
`TextGenerationPipeline`	Generalized token generator pipeline.
`TextGenerationPipelineInterface`	Interface for text generation pipelines.

Model interface

`GenerateMixin`	Protocol for pipelines that support text generation.
`MemoryEstimator`	Estimates available memory for pipeline model allocation.
`ModelInputs`	Base class for model inputs.
`ModelOutputs`	Pipeline model outputs.
`PipelineModel`	A pipeline model with setup, input preparation and execution methods.

Context

`PixelContext`	A model-ready context for image/video generation requests.
`TextAndVisionContext`	A base class for model context, specifically for Vision model variants.
`TextContext`	A base class for model context, specifically for Text model variants.
`TTSContext`	A context for Text-to-Speech (TTS) model inference.

Tokenizers

`IdentityPipelineTokenizer`	A pass-through tokenizer that returns prompts unchanged.
`PreTrainedPipelineTokenizer`	A pipeline tokenizer backed by a Hugging Face pre-trained tokenizer.
`TextAndVisionTokenizer`	Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.
`TextTokenizer`	Encapsulates creation of TextContext and specific token encode/decode logic.

Enums

`PipelineRole`	alias of `Literal['prefill_and_decode', 'prefill_only', 'decode_only']`
`PrometheusMetricsMode`	alias of `Literal['instrument_only', 'launch_server', 'launch_multiproc_server']`
`RepoType`	alias of `Literal['online', 'local']`
`RopeType`	alias of `Literal['none', 'normal', 'neox', 'longrope', 'yarn']`
`SupportedEncoding`	alias of `Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq']`
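
Because these names are `Literal` aliases rather than `enum.Enum` classes, their values are plain strings, and the allowed set can be read back at runtime with `typing.get_args`. The following is a minimal sketch of that pattern. It redeclares `RopeType` locally as a stand-in instead of importing it from `max.pipelines`, and `check_rope_type` is a hypothetical helper, not part of the module.

```python
from typing import Literal, get_args

# Local stand-in mirroring the documented alias (assumption: the real alias
# exposed by max.pipelines has the same members).
RopeType = Literal["none", "normal", "neox", "longrope", "yarn"]


def check_rope_type(value: str) -> str:
    """Hypothetical helper: return `value` if it is an allowed RopeType string."""
    allowed = get_args(RopeType)  # tuple of the literal strings
    if value not in allowed:
        raise ValueError(f"unknown rope type {value!r}; expected one of {allowed}")
    return value
```

A static type checker rejects an out-of-set string passed where `RopeType` is expected, while a runtime check like this one covers values that arrive from configuration files or command-line arguments.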

Utilities

`PrependPromptSpeechTokens`	alias of `Literal['never', 'once', 'rolling']`

`download_weight_files`	Downloads weight files for a Hugging Face model and returns local paths.
`is_float4_encoding`	Returns whether the given encoding is a float4 type.
`parse_supported_encoding_from_file_name`	Infers a SupportedEncoding from a file name string.
`supported_encoding_dtype`	Returns the underlying model dtype for the given encoding.
`supported_encoding_quantization`	Returns the QuantizationEncoding for the given encoding.
`supported_encoding_supported_devices`	Returns the devices that the given encoding is supported on.
`supported_encoding_supported_on`	Returns whether the given encoding is supported on a device.
`upper_bounded_default`	Returns a value not exceeding the upper bound.

Submodules
