Mojo function
output_reg_to_smem_st_matrix
output_reg_to_smem_st_matrix[output_type: DType, accum_type: DType, num_m_mmas: Int, o_frag_size: Int, //, BM: Int, padded_depth: Int, swizzle: Swizzle, num_consumer: Int](warp_group_thread_idx: UInt32, local_warp_group_idx: UInt32, output_reg_tile: LayoutTensor[accum_type, Layout.row_major(num_m_mmas, o_frag_size), MutAnyOrigin, address_space=AddressSpace.LOCAL], accum_smem_tile: LayoutTensor[output_type, Layout.row_major(BM, padded_depth), MutAnyOrigin, address_space=AddressSpace.SHARED])