Mojo function
apply_qk_rms_norm_cpu
def apply_qk_rms_norm_cpu[in_dtype: DType, out_dtype: DType, //](
    q_out: TileTensor[out_dtype, Storage=q_out.Storage, address_space=q_out.address_space, linear_idx_type=q_out.linear_idx_type, element_size=q_out.element_size],
    k_out: TileTensor[out_dtype, Storage=k_out.Storage, address_space=k_out.address_space, linear_idx_type=k_out.linear_idx_type, element_size=k_out.element_size],
    gamma_q: TileTensor[DType.float32, Storage=gamma_q.Storage, address_space=gamma_q.address_space, linear_idx_type=gamma_q.linear_idx_type, element_size=gamma_q.element_size],
    gamma_k: TileTensor[DType.float32, Storage=gamma_k.Storage, address_space=gamma_k.address_space, linear_idx_type=gamma_k.linear_idx_type, element_size=gamma_k.element_size],
    qk_var: TileTensor[DType.float32, Storage=qk_var.Storage, address_space=qk_var.address_space, linear_idx_type=qk_var.linear_idx_type, element_size=qk_var.element_size],
    q: TileTensor[in_dtype, Storage=q.Storage, address_space=q.address_space, linear_idx_type=q.linear_idx_type, element_size=q.element_size],
    k: TileTensor[in_dtype, Storage=k.Storage, address_space=k.address_space, linear_idx_type=k.linear_idx_type, element_size=k.element_size],
    epsilon: Float32,
    rows: Int,
    q_cols: Int,
    k_cols: Int,
)
Naive CPU reference path (also used as a correctness oracle).
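For orientation, here is a minimal Python sketch of the textbook RMS normalization that a Q/K reference path like this one would be expected to compute. It is not the Mojo implementation. The helper names `rms_norm_row` and `qk_rms_norm_reference` are hypothetical. The sketch also assumes each row is scaled by `1 / sqrt(mean(x^2) + epsilon)` and then by a per-column `gamma`, and it recomputes the mean of squares itself, because this page does not document how `qk_var` is laid out or consumed.

```python
import math


def rms_norm_row(x, gamma, epsilon):
    """Normalize one row by its root mean square, then scale by gamma.

    Assumption: the standard RMSNorm formula, not the verified Mojo kernel.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + epsilon)
    return [v * inv_rms * g for v, g in zip(x, gamma)]


def qk_rms_norm_reference(q, k, gamma_q, gamma_k, epsilon):
    """Apply RMS norm independently to every row of q and every row of k.

    q and k are lists of rows (lists of floats); gamma_q and gamma_k hold
    one weight per column of q and k respectively. The real function takes
    a precomputed qk_var input instead, whose layout is not assumed here.
    """
    q_out = [rms_norm_row(row, gamma_q, epsilon) for row in q]
    k_out = [rms_norm_row(row, gamma_k, epsilon) for row in k]
    return q_out, k_out
```

A simple, obviously correct version like this is what makes a reference path useful as an oracle: an optimized kernel's output can be compared against it elementwise within a tolerance.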