Python class

AudioGenerationOutput

class max.interfaces.AudioGenerationOutput(final_status, steps_executed, audio_data=<factory>, buffer_speech_tokens=None, metadata=<factory>)

Bases: Struct

Represents a response from the audio generation API.

This class encapsulates the result of an audio generation request, including the final status, generated audio data, and optional buffered speech tokens.
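To make the constructor signature concrete, here is a minimal sketch built on a stand-in dataclass rather than the real `max.interfaces` class. The names `StandInStatus` and `StandInOutput` are hypothetical, the status members are assumed, a plain `dict` stands in for `AudioGenerationMetadata`, and `array("f")` stands in for a float32 ndarray so that the sketch needs only the standard library. It illustrates that `final_status` and `steps_executed` are required, while the fields shown as `<factory>` in the signature receive a fresh default per instance.

```python
from array import array
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class StandInStatus(Enum):
    """Hypothetical stand-in for GenerationStatus; member names are assumed."""

    ACTIVE = "active"
    END_OF_SEQUENCE = "end_of_sequence"


def _empty_audio() -> array:
    # array("f") holds single-precision floats, standing in for a float32 ndarray.
    return array("f")


@dataclass
class StandInOutput:
    """Mirrors the documented fields; not the real max.interfaces class."""

    final_status: StandInStatus
    steps_executed: int
    audio_data: array = field(default_factory=_empty_audio)
    buffer_speech_tokens: Optional[array] = None
    metadata: dict = field(default_factory=dict)  # assumed placeholder type


first = StandInOutput(final_status=StandInStatus.ACTIVE, steps_executed=4)
second = StandInOutput(final_status=StandInStatus.ACTIVE, steps_executed=4)

# Each instance owns its default buffer, so mutating one leaves the other empty.
first.audio_data.extend([0.0, 0.25])
print(len(first.audio_data), len(second.audio_data))
```

The real class derives from `Struct`, so its construction and validation behavior may differ from a dataclass; only the field layout is taken from the signature above.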

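Parameters:

final_status (GenerationStatus)
steps_executed (int)
audio_data (ndarray[tuple[Any, ...], dtype[float32]])
buffer_speech_tokens (ndarray[tuple[Any, ...], dtype[integer[Any]]] | None)
metadata (AudioGenerationMetadata)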

audio_data

audio_data: ndarray[tuple[Any, ...], dtype[float32]]

The generated audio data, if available.

buffer_speech_tokens

buffer_speech_tokens: ndarray[tuple[Any, ...], dtype[integer[Any]]] | None

Buffered speech tokens, if available.

final_status

final_status: GenerationStatus

The final status of the generation process.

is_done

property is_done: bool

Indicates whether the audio generation process is complete.

Returns:

True if generation is done, False otherwise.
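One plausible use of `is_done` is as the stopping condition when draining a sequence of partial outputs. The sketch below assumes that outputs arrive as an iterable and that each carries its own slice of audio; neither assumption is confirmed by this page. `FakeChunk` and `collect_audio` are hypothetical names, `is_done` is modeled as a plain field rather than a property, and `array("f")` again stands in for a float32 ndarray.

```python
from array import array
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class FakeChunk:
    """Hypothetical stand-in exposing only the two members used here."""

    is_done: bool
    audio_data: array = field(default_factory=lambda: array("f"))


def collect_audio(chunks: Iterable[FakeChunk]) -> array:
    """Concatenate audio from each chunk, stopping after the first done chunk."""
    collected = array("f")
    for chunk in chunks:
        collected.extend(chunk.audio_data)
        if chunk.is_done:
            break
    return collected


stream = [
    FakeChunk(is_done=False, audio_data=array("f", [0.0, 0.5])),
    FakeChunk(is_done=True, audio_data=array("f", [1.0])),
    FakeChunk(is_done=False, audio_data=array("f", [9.0])),  # never reached
]
audio = collect_audio(stream)
print(list(audio))
```

The audio of the final chunk is appended before the loop exits, so a chunk that both carries data and reports completion is not dropped.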

metadata

metadata: AudioGenerationMetadata

Metadata associated with the audio generation, such as chunk information, prompt details, or other relevant context.

steps_executed

steps_executed: int

The number of steps previously executed.