Python module
max.pipelines
Types to interface with ML pipelines such as text/token/pixel generation.
Configuration
| Name | Description |
|---|---|
| AudioGenerationConfig | Configuration for an audio generation pipeline. |
| KVCacheConfig | Configuration for the paged KV cache. |
| LoRAConfig | Configuration for LoRA (Low-Rank Adaptation) inference. |
| MAXModelConfig | Configuration for a pipeline model. |
| PipelineConfig | Configuration for a pipeline. |
| ProfilingConfig | Configuration for GPU profiling of pipeline models. |
| SamplingConfig | Configuration for the sampling stage of token generation. |
| SpeculativeConfig | Configuration for speculative decoding. |
Pipelines
| Name | Description |
|---|---|
| EmbeddingsPipeline | A pipeline for generating embeddings from input text. |
| PixelGenerationPipeline | Pixel generation pipeline for diffusion models. |
| SpeechTokenGenerationPipeline | A text-to-speech token generation pipeline for TTS models. |
| TextGenerationPipeline | Generalized token generator pipeline. |
| TextGenerationPipelineInterface | Interface for text generation pipelines. |
Model interface
| Name | Description |
|---|---|
| GenerateMixin | Protocol for pipelines that support text generation. |
| MemoryEstimator | Estimates available memory for pipeline model allocation. |
| ModelInputs | Base class for model inputs. |
| ModelOutputs | Pipeline model outputs. |
| PipelineModel | A pipeline model with setup, input preparation, and execution methods. |
Context
| Name | Description |
|---|---|
| PixelContext | A model-ready context for image/video generation requests. |
| TextAndVisionContext | A base class for model context, specifically for vision model variants. |
| TextContext | A base class for model context, specifically for text model variants. |
| TTSContext | A context for text-to-speech (TTS) model inference. |
Tokenizers
| Name | Description |
|---|---|
| IdentityPipelineTokenizer | A pass-through tokenizer that returns prompts unchanged. |
| PreTrainedPipelineTokenizer | A pipeline tokenizer backed by a Hugging Face pre-trained tokenizer. |
| TextAndVisionTokenizer | Encapsulates creation of TextAndVisionContext and specific token encode/decode logic. |
| TextTokenizer | Encapsulates creation of TextContext and specific token encode/decode logic. |
Enums
| Alias | Definition |
|---|---|
| PipelineRole | alias of Literal['prefill_and_decode', 'prefill_only', 'decode_only'] |
| PrometheusMetricsMode | alias of Literal['instrument_only', 'launch_server', 'launch_multiproc_server'] |
| RepoType | alias of Literal['online', 'local'] |
| RopeType | alias of Literal['none', 'normal', 'neox', 'longrope', 'yarn'] |
| SupportedEncoding | alias of Literal['float32', 'bfloat16', 'q4_k', 'q4_0', 'q6_k', 'float8_e4m3fn', 'float4_e2m1fnx2', 'gptq'] |
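Each of these aliases is a standard `typing.Literal` type: a type checker accepts only the listed string values, and `typing.get_args` recovers them at runtime. The sketch below re-declares one alias from the table for illustration; the real definitions live in `max.pipelines`.

```python
from typing import Literal, get_args

# Illustrative re-declaration of the RepoType alias from the table above.
RepoType = Literal["online", "local"]

def describe_repo(repo_type: RepoType) -> str:
    # A type checker rejects values outside the allowed literals at
    # analysis time; this runtime check mirrors that for plain strings.
    if repo_type not in get_args(RepoType):
        raise ValueError(f"unsupported repo type: {repo_type!r}")
    return f"repo is {repo_type}"
```

The same pattern applies to the other aliases, e.g. validating a user-supplied encoding name against `get_args(SupportedEncoding)` before use.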
Utilities
| Alias | Definition |
|---|---|
| PrependPromptSpeechTokens | alias of Literal['never', 'once', 'rolling'] |

| Name | Description |
|---|---|
| download_weight_files | Downloads weight files for a Hugging Face model and returns local paths. |
| is_float4_encoding | Returns whether the given encoding is a float4 type. |
| parse_supported_encoding_from_file_name | Infers a SupportedEncoding from a file name string. |
| supported_encoding_dtype | Returns the underlying model dtype for the given encoding. |
| supported_encoding_quantization | Returns the QuantizationEncoding for the given encoding. |
| supported_encoding_supported_devices | Returns the devices that the given encoding is supported on. |
| supported_encoding_supported_on | Returns whether the given encoding is supported on a device. |
| upper_bounded_default | Returns a value not exceeding the upper bound. |
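As a rough illustration of the `upper_bounded_default` entry above, the sketch below implements its documented one-line behavior (return a value not exceeding the upper bound, falling back to the bound itself when no value is given). This is an assumed reading of the semantics, not the actual `max.pipelines` implementation.

```python
from typing import Optional

def upper_bounded_default(upper_bound: int, default: Optional[int]) -> int:
    """Illustrative sketch: clamp `default` to `upper_bound`.

    Assumed semantics based on the docstring "Returns a value not
    exceeding the upper bound"; when `default` is None, the bound
    itself is used (e.g. a max sequence length taken from the model
    config when the user supplies none).
    """
    if default is None:
        return upper_bound
    return min(default, upper_bound)
```

A typical use would be resolving a user-requested context length against the model's architectural maximum.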
Submodules