
Mojo function

dispatch_amd_4wave_conv2d

def dispatch_amd_4wave_conv2d[
    input_type: DType,
    filter_type: DType,
    output_type: DType,
    filter_is_fcrs: Bool,
    has_residual: Bool = False,
    elementwise_lambda_fn: Optional[def[dtype: DType, width: Int, *, alignment: Int = 1](IndexList[2], SIMD[dtype, width]) capturing -> None] = None
](
    input: TileTensor[input_type, address_space=input.address_space, linear_idx_type=input.linear_idx_type, element_size=input.element_size],
    filter: TileTensor[filter_type, address_space=filter.address_space, linear_idx_type=filter.linear_idx_type, element_size=filter.element_size],
    output: TileTensor[output_type, address_space=output.address_space, linear_idx_type=output.linear_idx_type, element_size=output.element_size],
    stride: IndexList[2],
    dilation: IndexList[2],
    symmetric_padding: IndexList[2],
    num_groups: Int,
    ctx: DeviceContext,
    source_ptr: Optional[UnsafePointer[Scalar[output_type], MutAnyOrigin]] = None,
    beta: Float32 = 0
) -> Bool

Try to dispatch a Conv2D to amd_4wave_conv on MI355X.

Returns True if the convolution was handled, or False if the caller should fall back (typically to MIOpen). See the module docstring for the full acceptance criteria.

When has_residual=True and source_ptr is set, computes D = Conv(A, B) + beta * source via the in-kernel fused residual path (amd_4wave_conv[has_residual=True]). The source pointer is expected to point to an NHWC-contiguous buffer with the same shape as output. When has_residual=False (the default), the call is identical to the no-residual variant: it adds no ABI overhead beyond 16 bytes in the launch packet for the dead-code-eliminated (DCE'd) source_ptr / stride / beta arguments.
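The fused residual formula can be checked against a small CPU reference model. The sketch below is written in plain Python rather than Mojo and is not the kernel's implementation. It assumes that A is the NHWC input, B is a filter in FCRS layout (out-channels, in-channels, kernel height, kernel width), that num_groups is 1, that the convolution is the usual cross-correlation, and that the output size follows the standard stride/dilation/padding formula. The function name conv2d_nhwc_reference is hypothetical.

```python
def conv2d_nhwc_reference(inp, filt, stride, dilation, padding,
                          source=None, beta=0.0):
    """Reference D = Conv(A, B) + beta * source (assumed semantics).

    inp:    nested list [N][H][W][C]      (NHWC input, "A")
    filt:   nested list [F][C][R][S]      (assumed FCRS filter, "B")
    stride, dilation, padding: (height, width) pairs; padding is symmetric
    source: optional nested list with the output's shape [N][HO][WO][F]
    """
    n, h, w, c = len(inp), len(inp[0]), len(inp[0][0]), len(inp[0][0][0])
    f, r, s = len(filt), len(filt[0][0]), len(filt[0][0][0])

    # Standard output-size formula (an assumption about the kernel).
    ho = (h + 2 * padding[0] - dilation[0] * (r - 1) - 1) // stride[0] + 1
    wo = (w + 2 * padding[1] - dilation[1] * (s - 1) - 1) // stride[1] + 1

    out = [[[[0.0] * f for _ in range(wo)] for _ in range(ho)]
           for _ in range(n)]
    for b in range(n):
        for oy in range(ho):
            for ox in range(wo):
                for k in range(f):
                    acc = 0.0
                    for ky in range(r):
                        iy = oy * stride[0] - padding[0] + ky * dilation[0]
                        if iy < 0 or iy >= h:
                            continue  # tap falls in the zero padding
                        for kx in range(s):
                            ix = ox * stride[1] - padding[1] + kx * dilation[1]
                            if ix < 0 or ix >= w:
                                continue  # tap falls in the zero padding
                            for ch in range(c):
                                acc += inp[b][iy][ix][ch] * filt[k][ch][ky][kx]
                    # Residual term: applied only when a source is supplied.
                    if source is not None:
                        acc += beta * source[b][oy][ox][k]
                    out[b][oy][ox][k] = acc
    return out
```

Calling it with source=None models the has_residual=False path. Passing a source with the output's shape and a nonzero beta models the fused path, which is useful for comparing against the result of a dispatched call on small shapes.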

Returns:

Bool