Python class

AudioGenerationMetadata

class max.interfaces.AudioGenerationMetadata(*, sample_rate=None, duration=None, chunk_id=None, timestamp=None, final_chunk=None, model_name=None, request_id=None, tokens_generated=None, processing_time=None, echo=None)

Bases: Struct

Represents metadata associated with audio generation.

This class is intended to eventually replace the metadata dictionary used by the AudioGenerationOutput object, providing a structured, type-safe alternative.

Parameters:

  • sample_rate (int | None) – The sample rate of the generated audio in Hz.
  • duration (float | None) – The duration of the generated audio in seconds.
  • chunk_id (int | None) – Identifier for the audio chunk (useful for streaming).
  • timestamp (str | None) – Timestamp when the audio was generated.
  • final_chunk (bool | None) – Whether this is the final chunk in a streaming sequence.
  • model_name (str | None) – Name of the model used for generation.
  • request_id (RequestID | None) – Unique identifier for the generation request.
  • tokens_generated (int | None) – Number of tokens generated for this audio.
  • processing_time (float | None) – Time taken to process this audio chunk in seconds.
  • echo (str | None) – Echo of the input prompt or identifier for verification.

chunk_id

chunk_id: int | None

duration

duration: float | None

echo

echo: str | None

final_chunk

final_chunk: bool | None

model_name

model_name: str | None

processing_time

processing_time: float | None

request_id

request_id: RequestID | None

sample_rate

sample_rate: int | None

timestamp

timestamp: str | None

to_dict()

Convert the metadata to a dictionary format.

Returns:

Dictionary representation of the metadata.

Return type:

dict[str, int | float | str | bool]

tokens_generated

tokens_generated: int | None
