Python class

AudioGenerationOutput

class max.interfaces.AudioGenerationOutput(final_status, steps_executed, audio_data=<factory>, buffer_speech_tokens=None, metadata=<factory>)

Bases: Struct

Represents a response from the audio generation API.

This class encapsulates the result of an audio generation request, including the final status, generated audio data, and optional buffered speech tokens.

Parameters:

final_status (GenerationStatus)
steps_executed (int)
audio_data (ndarray[tuple[Any, ...], dtype[float32]])
buffer_speech_tokens (ndarray[tuple[Any, ...], dtype[integer[Any]]] | None)
metadata (AudioGenerationMetadata)

`audio_data`

audio_data: ndarray[tuple[Any, ...], dtype[float32]]

The generated audio data, if available.
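Because `audio_data` is a `float32` array, a common next step is to encode it for playback or storage. The sketch below converts such an array to 16-bit PCM WAV bytes using NumPy and the standard-library `wave` module. The helper name `float32_to_wav_bytes`, the mono layout, the `[-1.0, 1.0]` sample range, and the 16 kHz sample rate are all assumptions made for illustration; this page does not specify them.

```python
import io
import wave

import numpy as np


def float32_to_wav_bytes(audio: np.ndarray, sample_rate: int) -> bytes:
    """Encode a float32 array as mono 16-bit PCM WAV bytes (hypothetical helper)."""
    # Assumption: samples are mono; any extra dimensions are flattened.
    mono = np.asarray(audio, dtype=np.float32).reshape(-1)
    # Assumption: samples lie in [-1.0, 1.0]; clip to avoid integer overflow.
    clipped = np.clip(mono, -1.0, 1.0)
    # Scale to the signed 16-bit range, little-endian as WAV expects.
    pcm = (clipped * 32767.0).astype("<i2")

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())
    return buf.getvalue()


# Stand-in for an output's audio_data field: 0.1 s of silence at 16 kHz.
audio_data = np.zeros(1600, dtype=np.float32)
wav_bytes = float32_to_wav_bytes(audio_data, sample_rate=16000)
```

The sample rate is passed explicitly because it has to come from the model or pipeline configuration rather than from the array itself.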

`buffer_speech_tokens`

buffer_speech_tokens: ndarray[tuple[Any, ...], dtype[integer[Any]]] | None

Buffered speech tokens, if available.
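Since this field may be `None`, callers need an explicit `None` check before using it. The sketch below shows one way to do that; `count_buffered_tokens` is a hypothetical helper, not part of `max.interfaces`.

```python
from typing import Optional

import numpy as np


def count_buffered_tokens(tokens: Optional[np.ndarray]) -> int:
    """Return how many tokens are buffered, treating None as zero (hypothetical helper)."""
    # Compare against None explicitly: truth-testing a multi-element
    # NumPy array (`if tokens:`) raises ValueError.
    if tokens is None:
        return 0
    return int(tokens.size)


no_tokens = count_buffered_tokens(None)
some_tokens = count_buffered_tokens(np.array([11, 12, 13], dtype=np.int64))
```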

`final_status`

final_status: GenerationStatus

The final status of the generation process.

`is_done`

property is_done: bool

Indicates whether the audio generation process is complete.

Returns: True if generation is done, False otherwise.
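A typical use of `is_done` is to stop consuming chunked outputs once generation has finished. The sketch below illustrates that pattern with stand-in classes so it runs without the library. `StubStatus`, `StubOutput`, and `collect_audio` are hypothetical names, the rule that "done" means "status is no longer active" is an assumption, and `audio_data` is a plain list here rather than the documented `float32` array.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable


class StubStatus(Enum):
    """Hypothetical stand-in for GenerationStatus."""

    ACTIVE = "active"
    END_OF_SEQUENCE = "end_of_sequence"


@dataclass
class StubOutput:
    """Hypothetical stand-in mirroring some of the documented fields."""

    final_status: StubStatus
    steps_executed: int
    audio_data: list[float] = field(default_factory=list)

    @property
    def is_done(self) -> bool:
        # Assumption: generation is done once the status is no longer ACTIVE.
        return self.final_status is not StubStatus.ACTIVE


def collect_audio(outputs: Iterable[StubOutput]) -> tuple[list[float], int]:
    """Concatenate audio chunks, stopping after the first finished output."""
    samples: list[float] = []
    chunks_consumed = 0
    for out in outputs:
        samples.extend(out.audio_data)
        chunks_consumed += 1
        if out.is_done:
            break
    return samples, chunks_consumed


stream = [
    StubOutput(StubStatus.ACTIVE, 4, [0.0, 0.1]),
    StubOutput(StubStatus.END_OF_SEQUENCE, 2, [0.2]),
    StubOutput(StubStatus.ACTIVE, 1, [9.9]),  # never reached
]
samples, chunks_consumed = collect_audio(stream)
```

The final chunk's audio is appended before the loop exits, so a finished output that still carries data is not dropped.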

`metadata`

metadata: AudioGenerationMetadata

Metadata associated with the audio generation, such as chunk information, prompt details, or other relevant context.

`steps_executed`

steps_executed: int

The number of steps previously executed.
