Mojo module

reducescatter

Multi-GPU reducescatter implementation for distributed tensor reduction.

`comptime` values

`elementwise_epilogue_type`

comptime elementwise_epilogue_type = def[dtype: DType, width: Int, *, alignment: Int, ?, .element_types`0x2: Variadic[CoordLike]](Coord[element_types], SIMD[dtype, width]) capturing -> None

Structs

ReduceScatterConfig: Configuration for axis-aware reduce-scatter partitioning.

Functions

reducescatter: Per-device reducescatter operation with axis-aware scatter.
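
For readers new to the collective, the following is a minimal conceptual sketch of what a reduce-scatter with a selectable scatter axis computes. It is written in plain Python and is not this module's Mojo API. The names `partition` and `reduce_scatter` are hypothetical, and the policy of giving leftover elements to the lowest ranks is an assumption made for illustration; how `ReduceScatterConfig` actually partitions work is not stated on this page.

```python
def partition(length, num_devices):
    """Split range(length) into contiguous (start, end) chunks, one per device.

    Assumption for illustration: leftover elements go to the lowest ranks.
    """
    base, extra = divmod(length, num_devices)
    bounds = []
    start = 0
    for rank in range(num_devices):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def reduce_scatter(inputs, axis=0):
    """Model reduce-scatter over a list of same-shaped 2D buffers.

    inputs[d] is the buffer held by device d (a list of rows).
    Returns one output per device: the elementwise sum over all devices,
    restricted to that device's slice along `axis`.
    """
    num_devices = len(inputs)
    rows, cols = len(inputs[0]), len(inputs[0][0])

    # Reduce: elementwise sum across all device buffers.
    total = [
        [sum(buf[r][c] for buf in inputs) for c in range(cols)]
        for r in range(rows)
    ]

    # Scatter: each device keeps only its chunk along the chosen axis.
    bounds = partition(rows if axis == 0 else cols, num_devices)
    outputs = []
    for lo, hi in bounds:
        if axis == 0:
            outputs.append([row[:] for row in total[lo:hi]])
        else:
            outputs.append([row[lo:hi] for row in total])
    return outputs


dev0 = [[1, 2, 3, 4], [5, 6, 7, 8]]
dev1 = [[10, 20, 30, 40], [50, 60, 70, 80]]
by_rows = reduce_scatter([dev0, dev1], axis=0)
by_cols = reduce_scatter([dev0, dev1], axis=1)
```

With two devices, scattering along axis 0 leaves each device with one fully reduced row, while scattering along axis 1 leaves each device with a two-column slice of every reduced row. A real multi-GPU implementation would avoid materializing the full sum on any single device; the sketch only models the result.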
