Mojo struct
DistributedBroadcast
struct DistributedBroadcast
Distributed broadcast: copy tensor from root GPU to all GPUs.
A single instance of this op handles all participating GPUs. It receives:
- input: The source tensor from the root GPU (P2P accessible)
- outputs: Destination tensors, one per GPU
- signal_buffers: Synchronization buffers for all participating GPUs
- dev_ctxs_input: Device contexts for all participating GPUs
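The data movement described above can be modeled on the host. This is a minimal sketch in plain Python, not the Mojo API: `reference_broadcast` and its list-based buffers are hypothetical stand-ins for the per-GPU output tensors, and the signal buffers, device contexts, and P2P transfers are deliberately left out. It may be useful as a reference when checking the op's results.

```python
# Host-side reference model of the broadcast semantics.
# Assumption: plain Python lists stand in for per-GPU tensors; this is
# NOT the Mojo kernel and does no GPU synchronization or P2P access.
MAX_GPUS = 8  # limit stated in the documentation for execute


def reference_broadcast(root_input, outputs):
    """Copy root_input into every buffer in outputs (one per simulated GPU)."""
    num_gpus = len(outputs)
    if not 1 <= num_gpus <= MAX_GPUS:
        raise ValueError(f"expected 1..{MAX_GPUS} GPUs, got {num_gpus}")
    for out in outputs:
        if len(out) != len(root_input):
            raise ValueError("every output must match the input's size")
        out[:] = root_input  # copy into the existing buffer, in place
    return outputs


src = [1.0, 2.0, 3.0]
dst = [[0.0] * 3 for _ in range(4)]  # four simulated GPUs
reference_broadcast(src, dst)
```

After the call, every entry of `dst` holds its own copy of the values in `src`, which is the outcome the op is documented to produce across the real output tensors.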
Implemented traits
AnyType,
ImplicitlyDestructible
Methods
execute
static def execute[dtype: DType, rank: Int, root: Int, target: StringSlice[StaticConstantOrigin], _trace_name: StringSlice[StaticConstantOrigin]](outputs: VariadicTensors[Output, static_specs=outputs.static_specs], input: ManagedTensorSlice[Input, static_spec=input.static_spec], signal_buffers: VariadicTensors[MutableInput, static_specs=signal_buffers.static_specs], dev_ctxs_input: DeviceContextList)
Execute distributed broadcast operation.
Limitations:
- Maximum of 8 GPUs supported (MAX_GPUS).
- Requires P2P access between GPUs (NVLink or PCIe P2P).
Parameters:
- dtype (DType): Data type of the tensor.
- rank (Int): Tensor rank (number of dimensions).
- root (Int): Index of the root GPU (source of data).
- target (StringSlice[StaticConstantOrigin]): Target device string for tracing.
- _trace_name (StringSlice[StaticConstantOrigin]): Trace name for profiling.
Args:
- outputs (VariadicTensors[Output, static_specs=outputs.static_specs]): Output tensors (one per GPU) to store broadcast results.
- input (ManagedTensorSlice[Input, static_spec=input.static_spec]): Input tensor from root GPU (P2P accessible from all GPUs).
- signal_buffers (VariadicTensors[MutableInput, static_specs=signal_buffers.static_specs]): Synchronization buffers for cross-GPU coordination.
- dev_ctxs_input (DeviceContextList): Device contexts for participating GPUs.