Mojo function
allreduce_2stage_quickreduce
allreduce_2stage_quickreduce[
    dtype: DType,
    rank: Int,
    ngpus: Int,
    *,
    BLOCK_SIZE: Int,
    output_lambda: fn[dtype: DType, rank: Int, width: Int, *, alignment: Int](IndexList[rank], SIMD[dtype, width]) capturing -> None,
    atom_size: Int
](
    result: NDBuffer[dtype, rank, MutableAnyOrigin],
    local_src: UnsafePointer[Scalar[dtype]],
    rank_sigs: InlineArray[UnsafePointer[Signal], 8],
    num_elements: Int,
    my_rank: Int,
    iteration: Int,
    num_tiles_total: Int
)