Mojo struct

SplitKWorkspace

struct SplitKWorkspace[num_splits: Int]

Pre-allocated scratch buffer that is reused across repeated split-K launches.

Allocate one workspace per (M, N) output shape and pass it to amd_4wave_split_k_matmul. The buffer must hold num_splits * M * N float32 elements, that is, room for one M x N partial result per K-split.
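As a rough sizing check, the element count above converts to bytes as follows. The values M = N = 4096 and num_splits = 4 are illustrative assumptions, not recommended settings; the only fixed fact used is that a float32 element occupies 4 bytes.

```latex
\begin{aligned}
\text{elements} &= \texttt{num\_splits} \cdot M \cdot N
                 = 4 \cdot 4096 \cdot 4096 = 67{,}108{,}864 \\
\text{bytes}    &= 4 \cdot \text{elements}
                 = 268{,}435{,}456 \quad (256\ \text{MiB})
\end{aligned}
```

Because the footprint grows linearly with num_splits, a workspace sized for one (M, N) shape is not guaranteed to fit a larger one.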

Parameters

  • num_splits (Int): Number of K-splits the workspace must hold.

Fields

  • scratch (DeviceBuffer[DType.float32]): Backing float32 device buffer holding num_splits * elems_per_split elements.

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

Methods

__init__

__init__(out self, ctx: DeviceContext, elems_per_split: Int)

Allocates the scratch buffer on the device, sized for num_splits splits of elems_per_split float32 elements each.

Args:

  • ctx (DeviceContext): Device context used to allocate the buffer.
  • elems_per_split (Int): Number of float32 elements per K-split (typically M * N).

Raises:

An error if device allocation fails.