Mojo function
store_fragment_to_smem
store_fragment_to_smem[vec_dtype: DType, vec_size: Int, //, swizzle: Swizzle, stageN: Int, transpose_c: Bool = False, c_swizzle: TensorMapSwizzle = TensorMapSwizzle.SWIZZLE_128B](vec: InlineArray[Scalar[vec_dtype], vec_size], dst: TileTensor[dst.dtype, dst.LayoutType, dst.origin, address_space=AddressSpace.SHARED, linear_idx_type=dst.linear_idx_type, element_size=dst.element_size], warp_offset: UInt32 = UInt32(0))
Stores a fragment to shared memory (SMEM), using the st.matrix instruction for bf16 output types and the st.shared instruction for FP8 output types.