Mojo struct
SplitKWorkspace
struct SplitKWorkspace[num_splits: Int]
Pre-allocated scratch for repeated split-K launches.
Allocate once per (M, N) and pass to amd_4wave_split_k_matmul.
The buffer must hold num_splits * M * N float32 elements.
Parameters
- num_splits (Int): Number of K-splits the workspace must hold.
Fields
- scratch (DeviceBuffer[DType.float32]): Backing float32 device buffer of size num_splits * elems_per_split.
Implemented traits
AnyType,
Copyable,
ImplicitlyCopyable,
ImplicitlyDestructible,
Movable
Methods
__init__
__init__(out self, ctx: DeviceContext, elems_per_split: Int)
Allocates the per-split scratch buffer on the device.
Args:
- ctx (DeviceContext): Device context used to allocate the buffer.
- elems_per_split (Int): Number of float32 elements per K-split (typically M * N).
Raises:
An error if device allocation fails.
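The sizing rule above (num_splits * elems_per_split float32 elements, with elems_per_split typically M * N) can be checked with a little arithmetic before allocating. The following is a minimal Python sketch of that calculation only; it does not call the Mojo API, and the helper names `split_k_scratch_elems` and `split_k_scratch_bytes` are hypothetical, introduced here for illustration. It assumes the usual 4-byte float32.

```python
FLOAT32_BYTES = 4  # size of one float32 element


def split_k_scratch_elems(num_splits: int, m: int, n: int) -> int:
    """Total float32 elements the scratch buffer must hold.

    Hypothetical helper: mirrors the documented rule
    num_splits * elems_per_split, with elems_per_split = M * N.
    """
    if num_splits <= 0 or m <= 0 or n <= 0:
        raise ValueError("num_splits, m and n must be positive")
    elems_per_split = m * n
    return num_splits * elems_per_split


def split_k_scratch_bytes(num_splits: int, m: int, n: int) -> int:
    """Device memory footprint of the scratch buffer in bytes."""
    return split_k_scratch_elems(num_splits, m, n) * FLOAT32_BYTES


if __name__ == "__main__":
    # Example shape: 4 K-splits over a 1024 x 2048 output tile.
    elems = split_k_scratch_elems(4, 1024, 2048)
    size = split_k_scratch_bytes(4, 1024, 2048)
    print(f"{elems} elements, {size / 2**20:.0f} MiB")
```

For this example shape the workspace holds 8,388,608 elements, or 32 MiB of device memory. The footprint grows linearly with num_splits, which is worth checking when choosing a split count for large M and N.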