Mojo function
shared_memory_epilogue
shared_memory_epilogue[MMA_M: UInt, data_paths: UInt, num_stages: UInt, stage: UInt, stageN: UInt, c_type: DType, shared_n: UInt, simd_size: UInt, c_smem_upper_layout: Layout, c_smem_lower_layout: Layout, swizzle: Swizzle, compute_lambda_fn: elementwise_compute_lambda_type, num_output_warps: UInt](M: UInt32, N: UInt32, c_col: UInt, c_row: UInt, c_smem_warp_tile_upper: LayoutTensor[c_type, c_smem_upper_layout, MutAnyOrigin, address_space=AddressSpace.SHARED, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], c_smem_warp_tile_lower: LayoutTensor[c_type, c_smem_lower_layout, MutAnyOrigin, address_space=AddressSpace.SHARED, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment])
Apply element-wise epilogue to non-transposed shared memory tile.
Handles the non-transposed case of the shared-memory (SMEM) epilogue. The upper and lower fragments are processed separately, each with its own coordinate mapping.
Template Parameters:
- MMA_M (UInt): MMA M dimension.
- data_paths (UInt): Number of data paths (typically 16).
- num_stages (UInt): Total number of output stages.
- stage (UInt): Current output stage index.
- stageN (UInt): Stage width in elements.
- c_type (DType): Output data type.
- shared_n (UInt): Shared memory N dimension.
- simd_size (UInt): SIMD width for vectorized access.
- c_smem_upper_layout (Layout): Layout for upper fragment tile.
- c_smem_lower_layout (Layout): Layout for lower fragment tile.
- swizzle (Swizzle): Swizzle pattern for SMEM access.
- compute_lambda_fn (elementwise_compute_lambda_type): Element-wise compute function.
- num_output_warps (UInt): Number of warps participating.
Args:
- M (UInt32): Output M dimension.
- N (UInt32): Output N dimension.
- c_col (UInt): Base column coordinate.
- c_row (UInt): Base row coordinate.
- c_smem_warp_tile_upper (LayoutTensor): Upper fragment shared memory tile.
- c_smem_warp_tile_lower (LayoutTensor): Lower fragment shared memory tile.
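The coordinate mapping described above can be pictured with a small host-side model. This is a plain-Python sketch and not Mojo kernel code: the helper name `apply_epilogue`, the nested-list tiles, the in-place update, the bounds check against M and N, and the assumption that the lower fragment sits `data_paths` rows below the upper one are all illustrative guesses, not behavior confirmed by this page. Swizzling, SIMD vectorization, staging, and warp distribution are ignored.

```python
def apply_epilogue(upper, lower, M, N, c_row, c_col, compute, data_paths=16):
    """Hypothetical model: update two fragment tiles in place.

    `compute(row, col, value)` stands in for compute_lambda_fn and receives
    the output coordinates of each element plus its current value.
    """
    for frag_index, tile in enumerate((upper, lower)):
        # Assumption: the lower fragment starts data_paths rows below the upper one.
        row_base = c_row + frag_index * data_paths
        for i, tile_row in enumerate(tile):
            for j, value in enumerate(tile_row):
                row, col = row_base + i, c_col + j
                # Assumption: elements outside the M x N output are left untouched.
                if row < M and col < N:
                    tile_row[j] = compute(row, col, value)


# Tiny example: two 2x2 fragments, an output with 17 rows and 2 columns.
upper = [[1.0, 2.0], [3.0, 4.0]]
lower = [[5.0, 6.0], [7.0, 8.0]]
apply_epilogue(upper, lower, M=17, N=2, c_row=0, c_col=0,
               compute=lambda row, col, value: value + row)
```

In this model the upper fragment covers output rows 0 and 1, while the lower fragment covers rows 16 and 17; row 17 falls outside M = 17, so its values are not modified.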