Python class
AudioGenerationConfig
class max.pipelines.AudioGenerationConfig(audio_decoder, audio_decoder_weights='', chunk_size=None, buffer=0, block_causal=False, prepend_prompt_speech_tokens='never', prepend_prompt_speech_tokens_causal=False, run_model_test_mode=False, prometheus_metrics_mode='instrument_only', *, config_file=None, section_name=None, debug_verify_replay=False, models=<factory>, model_override=<factory>, sampling=<factory>, profiling=<factory>, lora=None, speculative=None, runtime=<factory>, audio_decoder_config=<factory>)
Bases: PipelineConfig
Configuration for an audio generation pipeline.
Parameters:
- audio_decoder (str)
- audio_decoder_weights (str)
- chunk_size (list[int] | None)
- buffer (int)
- block_causal (bool)
- prepend_prompt_speech_tokens (Literal['never', 'once', 'rolling'])
- prepend_prompt_speech_tokens_causal (bool)
- run_model_test_mode (bool)
- prometheus_metrics_mode (Literal['instrument_only', 'launch_server', 'launch_multiproc_server'])
- config_file (str | None)
- section_name (str | None)
- debug_verify_replay (bool)
- models (dict[str, MAXModelConfig])
- model_override (list[str])
- sampling (SamplingConfig)
- profiling (ProfilingConfig)
- lora (LoRAConfig | None)
- speculative (SpeculativeConfig | None)
- runtime (PipelineRuntimeConfig)
- audio_decoder_config (dict[str, Any])
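To make the documented defaults easier to scan, the audio-specific fields can be mirrored in a small stand-in. This is only a sketch: `AudioGenerationConfigSketch` is a hypothetical standard-library dataclass, not the real `max.pipelines` class, and `"my_decoder"` is a placeholder architecture name. The field names, types, and defaults are copied from the signature above; the inherited `PipelineConfig` fields are omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Literal, Optional


@dataclass
class AudioGenerationConfigSketch:
    """Hypothetical stand-in for the documented fields; NOT the real class."""

    audio_decoder: str  # required: the signature gives no default
    audio_decoder_weights: str = ""
    chunk_size: Optional[list[int]] = None
    buffer: int = 0
    block_causal: bool = False
    prepend_prompt_speech_tokens: Literal["never", "once", "rolling"] = "never"
    prepend_prompt_speech_tokens_causal: bool = False
    # Mutable default, so it needs a factory (shown as <factory> in the signature).
    audio_decoder_config: dict[str, Any] = field(default_factory=dict)


# "my_decoder" is a placeholder, not a real architecture name.
cfg = AudioGenerationConfigSketch(audio_decoder="my_decoder", buffer=16)
```

Only `audio_decoder` has to be supplied; every other field falls back to the default listed in the signature.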
audio_decoder
audio_decoder: str
The name of the audio decoder model architecture.
audio_decoder_config
audio_decoder_config: dict[str, Any]
Parameters to pass to the audio decoder model.
audio_decoder_weights
audio_decoder_weights: str
The path to the audio decoder weights file.
block_causal
block_causal: bool
Whether prior buffered tokens attend to tokens in the current block.
buffer
buffer: int
The number of previous speech tokens to pass to the audio decoder on each generation step.
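One way to picture `buffer` is as a sliding context window placed in front of each new block of speech tokens. The helper below is a hypothetical illustration of that idea, assuming the tokens are a flat list and the block is given by start and end indices. It is not taken from the pipeline's source.

```python
def decoder_window(tokens, block_start, block_end, buffer):
    """Split tokens into (context, block).

    context: up to `buffer` tokens immediately before the block.
    block:   the tokens generated in the current step.
    Hypothetical helper for illustration only.
    """
    context_start = max(0, block_start - buffer)
    return tokens[context_start:block_start], tokens[block_start:block_end]


tokens = list(range(10))
context, block = decoder_window(tokens, block_start=6, block_end=9, buffer=4)
# With buffer=0 (the documented default) the context is always empty.
no_context, _ = decoder_window(tokens, block_start=6, block_end=9, buffer=0)
```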
chunk_size
chunk_size: list[int] | None
The chunk sizes to use for streaming.
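The source does not say how a list of chunk sizes is consumed. The sketch below shows one plausible reading, labeled as an assumption: sizes are used in order, the last size repeats once the list is exhausted, and `None` means a single chunk. `iter_chunks` is a hypothetical helper, not part of `max.pipelines`.

```python
from itertools import chain, repeat


def iter_chunks(tokens, chunk_size):
    """Yield successive chunks of `tokens`.

    ASSUMPTION: sizes are consumed in order and the last one repeats;
    chunk_size=None yields everything as one chunk.
    """
    if not chunk_size:
        yield list(tokens)
        return
    if any(size <= 0 for size in chunk_size):
        raise ValueError("chunk sizes must be positive")
    pos = 0
    for size in chain(chunk_size, repeat(chunk_size[-1])):
        if pos >= len(tokens):
            break
        yield list(tokens[pos:pos + size])
        pos += size


chunks = list(iter_chunks(list(range(10)), [2, 3]))
```

A small first chunk followed by larger ones is a common streaming pattern because it lowers the time to first audio, but whether this pipeline uses the list that way is not stated here.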
from_flags()
classmethod from_flags(audio_flags, **config_flags)
Builds an AudioGenerationConfig from audio CLI flags and config kwargs.
Parameters:
- audio_flags
- **config_flags
model_config
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'strict': False}
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).
model_post_init()
model_post_init(context, /)
This function is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that’s what pydantic-core passes when calling it.
Parameters:
- self (BaseModel) – The BaseModel instance.
- context (Any) – The context.
Return type:
None
prepend_prompt_speech_tokens
prepend_prompt_speech_tokens: PrependPromptSpeechTokens
Whether the prompt speech tokens are forwarded to the audio decoder.
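The signature lists three accepted values (`'never'`, `'once'`, `'rolling'`) but this page does not define them. The sketch below encodes a guess based on the names alone: `'never'` sends only the chunk, `'once'` prepends the prompt tokens to the first chunk, and `'rolling'` prepends them to every chunk. `decoder_input` is a hypothetical helper; check the pipeline source before relying on this reading.

```python
def decoder_input(prompt_tokens, chunk, chunk_index, mode):
    """Build the token list sent to the decoder for one chunk.

    ASSUMED semantics (not confirmed by the documentation):
      never   -> chunk only
      once    -> prompt tokens prepended to the first chunk only
      rolling -> prompt tokens prepended to every chunk
    """
    if mode == "never":
        return list(chunk)
    if mode == "once":
        prefix = list(prompt_tokens) if chunk_index == 0 else []
        return prefix + list(chunk)
    if mode == "rolling":
        return list(prompt_tokens) + list(chunk)
    raise ValueError(f"unknown mode: {mode!r}")


first = decoder_input([9, 9], [1, 2], chunk_index=0, mode="once")
second = decoder_input([9, 9], [3, 4], chunk_index=1, mode="once")
```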
prepend_prompt_speech_tokens_causal
prepend_prompt_speech_tokens_causal: bool
Whether the prompt speech tokens attend to tokens in the current audio block.
prometheus_metrics_mode
prometheus_metrics_mode: PrometheusMetricsMode
The mode to use for Prometheus metrics.
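The accepted values appear in the parameter list above as a `Literal` type. As a minimal sketch, a caller could check a string against those values before building a config. The local `PrometheusMetricsMode` alias below is a stand-in defined for this example from the documented literals, not an import from `max.pipelines`, and `parse_metrics_mode` is a hypothetical helper.

```python
from typing import Literal, get_args

# Local stand-in built from the literals documented in the parameter list.
PrometheusMetricsMode = Literal[
    "instrument_only", "launch_server", "launch_multiproc_server"
]


def parse_metrics_mode(value):
    """Return `value` if it is a documented mode, else raise ValueError."""
    allowed = get_args(PrometheusMetricsMode)
    if value not in allowed:
        raise ValueError(f"expected one of {allowed}, got {value!r}")
    return value


mode = parse_metrics_mode("launch_server")
```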