Mojo struct
K2qCsrDeviceSizes
struct K2qCsrDeviceSizes
Host-computed sizing for the device CSR (allocated by the caller).
Fields
- batch (Int)
- total_rows (Int)
- max_kv_blocks (Int)
- work_capacity (Int)
- g (Int): CTAs over the q-range (hist/scatter grid.x).
- kwarps (Int): Warps per CTA; each owns a contiguous q-sub-range.
- g_total (Int): Number of units = g * kwarps (the tile_counts unit axis length).
- q_per_cta (Int): Queries per CTA (ceil(total_q / g)).
- q_per_warp (Int): Queries per warp (ceil(q_per_cta / kwarps)).
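The relationships among g, kwarps, g_total, q_per_cta, and q_per_warp stated in the field descriptions can be checked with a small worked example. This is only a sketch of the arithmetic, not the library's own sizing code: the helper ceil_div and the input values (total_q = 1000, g = 6, kwarps = 4) are hypothetical and chosen for illustration.

```mojo
# Hypothetical helper (not part of the documented API):
# integer ceiling division for positive operands.
def ceil_div(a: Int, b: Int) -> Int:
    return (a + b - 1) // b


def main():
    # Illustrative inputs only; real values come from the host-side sizing step.
    var total_q = 1000  # total number of queries
    var g = 6           # CTAs over the q-range
    var kwarps = 4      # warps per CTA

    # Derived quantities, following the formulas in the field descriptions.
    var g_total = g * kwarps                   # 6 * 4 = 24 units
    var q_per_cta = ceil_div(total_q, g)       # ceil(1000 / 6) = 167
    var q_per_warp = ceil_div(q_per_cta, kwarps)  # ceil(167 / 4) = 42

    print(g_total, q_per_cta, q_per_warp)
```

Because both divisions round up, g * q_per_cta (1002 here) can exceed total_q, so the last CTA and the last warp in each CTA may cover fewer queries than the others.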
Implemented traits
AnyType,
Copyable,
ImplicitlyDeletable,
Movable
Methods
tile_counts_len
def tile_counts_len(self, head_kv: Int) -> Int
Length of the tile_counts scratch buffer (the caller allocates it).
Returns:
Int