Mojo function

varlen_selective_state_update_gpu

def varlen_selective_state_update_gpu[
    kernel_dtype: DType,
    DSTATE: Int,
    state_LT: TensorLayout,
    x_LT: TensorLayout,
    dt_LT: TensorLayout,
    A_LT: TensorLayout,
    B_LT: TensorLayout,
    C_LT: TensorLayout,
    D_LT: TensorLayout,
    z_LT: TensorLayout,
    output_LT: TensorLayout,
    dt_bias_LT: TensorLayout,
    state_batch_indices_LT: TensorLayout
](
    total_threads: Int,
    batch: Int,
    nheads: Int,
    dim: Int,
    nheads_ngroups_ratio: Int,
    pad_slot_id: Int32,
    dt_softplus: Int8,
    has_state_batch_indices: Int8,
    state: TileTensor[kernel_dtype, state_LT, MutUntrackedOrigin],
    x: TileTensor[kernel_dtype, x_LT, MutUntrackedOrigin],
    dt: TileTensor[kernel_dtype, dt_LT, MutUntrackedOrigin],
    A: TileTensor[kernel_dtype, A_LT, MutUntrackedOrigin],
    B: TileTensor[kernel_dtype, B_LT, MutUntrackedOrigin],
    C: TileTensor[kernel_dtype, C_LT, MutUntrackedOrigin],
    D: TileTensor[kernel_dtype, D_LT, MutUntrackedOrigin],
    z: TileTensor[kernel_dtype, z_LT, MutUntrackedOrigin],
    output: TileTensor[kernel_dtype, output_LT, MutUntrackedOrigin],
    dt_bias: TileTensor[kernel_dtype, dt_bias_LT, MutUntrackedOrigin],
    state_batch_indices: TileTensor[DType.int32, state_batch_indices_LT, MutUntrackedOrigin],
    state_strides: IndexList[Int(4)],
    x_strides: IndexList[Int(3)],
    dt_strides: IndexList[Int(3)],
    dt_bias_strides: IndexList[Int(2)],
    A_strides: IndexList[Int(3)],
    B_strides: IndexList[Int(3)],
    C_strides: IndexList[Int(3)],
    D_strides: IndexList[Int(2)],
    z_strides: IndexList[Int(3)],
    out_strides: IndexList[Int(3)]
)

GPU kernel for selective state update with multi-head support.
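
This page does not spell out the arithmetic. As a rough illustration only, the sketch below shows the single-step selective state update as it is commonly formulated for Mamba-style state space models, applied to one (sequence, head) slice in plain Python. It is not this kernel's implementation: the helper names `softplus` and `selective_state_update_ref`, the nested-list shapes, and the exact formula (including the `D` skip term and the `z` gating) are assumptions based on that common formulation and on the parameter names in the signature.

```python
import math


def softplus(v):
    # Numerically stable log(1 + exp(v)).
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def selective_state_update_ref(state, x, dt, A, B, C, D=None, z=None,
                               dt_bias=None, dt_softplus=False):
    """Hypothetical reference for one (sequence, head) slice.

    Assumed shapes: state and A are [dim][dstate]; x, dt, D, z, dt_bias
    are [dim]; B and C are [dstate]. `state` is updated in place.
    """
    out = []
    for d in range(len(x)):
        # Optional bias and softplus on the time step.
        step = dt[d] + (dt_bias[d] if dt_bias is not None else 0.0)
        if dt_softplus:
            step = softplus(step)
        acc = 0.0
        for n in range(len(B)):
            # Discretized recurrence: decay the old state, add the new input.
            state[d][n] = state[d][n] * math.exp(A[d][n] * step) + B[n] * step * x[d]
            # Project the updated state to the output.
            acc += state[d][n] * C[n]
        if D is not None:
            acc += D[d] * x[d]  # skip connection
        if z is not None:
            acc *= z[d] / (1.0 + math.exp(-z[d]))  # SiLU gating: z * sigmoid(z)
        out.append(acc)
    return out


state = [[1.0, 2.0]]
out = selective_state_update_ref(
    state, x=[3.0], dt=[0.5], A=[[0.0, 0.0]], B=[1.0, 2.0], C=[1.0, 1.0], D=[2.0]
)
print(out, state)
```

The sketch leaves out everything the signature suggests the GPU kernel handles beyond this arithmetic: iterating over `batch` and `nheads`, sharing `B` and `C` across heads in a group (presumably via `nheads_ngroups_ratio`), selecting state slots through `state_batch_indices`, and treating `pad_slot_id` entries specially.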