Mojo struct

MLASparseConfig

struct MLASparseConfig[qkv_dtype: DType, b_topk_: Int = 128, num_mbars_: Int = 2, q_smem_depth_: Int = 192, q_tmem_depth_: Int = 384]

Fields

num_q_heads (Int)
num_kv_heads (Int)
qk_depth (Int)
v_depth (Int)
indices_stride (Int)
group (Int)

Implemented traits

AnyType, ImplicitlyDeletable

`comptime` members

`B_TOPK`

comptime B_TOPK = b_topk_

`cta_group`

comptime cta_group = 2

`k_swizzle_mode`

comptime k_swizzle_mode = TensorMapSwizzle.SWIZZLE_128B

`num_mbars`

comptime num_mbars = num_mbars_

`num_threads`

comptime num_threads = 512

`output_swizzle_mode`

comptime output_swizzle_mode = TensorMapSwizzle.SWIZZLE_128B

`q_smem_depth`

comptime q_smem_depth = q_smem_depth_

`q_swizzle_mode`

comptime q_swizzle_mode = TensorMapSwizzle.SWIZZLE_128B

`q_tmem_depth`

comptime q_tmem_depth = q_tmem_depth_

`qkv_dtype_size`

comptime qkv_dtype_size = size_of[qkv_dtype]()

`sm100_tmem_cols`

comptime sm100_tmem_cols = 512

Methods

`__init__`

def __init__(out self, *, num_q_heads: Int, num_kv_heads: Int, qk_depth: Int, v_depth: Int, indices_stride: Int, group: Int)
