
Mojo function

causal_conv1d_channel_first_fwd_cpu

causal_conv1d_channel_first_fwd_cpu[x_dtype: DType, weight_dtype: DType, output_dtype: DType, bias_dtype: DType](batch: Int, dim: Int, seqlen: Int, width: Int, x: TileTensor[x_dtype, address_space=x.address_space, linear_idx_type=x.linear_idx_type, element_size=x.element_size], weight: TileTensor[weight_dtype, address_space=weight.address_space, linear_idx_type=weight.linear_idx_type, element_size=weight.element_size], output: TileTensor[output_dtype, address_space=output.address_space, linear_idx_type=output.linear_idx_type, element_size=output.element_size], bias: TileTensor[bias_dtype, address_space=bias.address_space, linear_idx_type=bias.linear_idx_type, element_size=bias.element_size], x_batch_stride: UInt32, x_c_stride: UInt32, x_l_stride: UInt32, weight_c_stride: UInt32, weight_width_stride: UInt32, out_batch_stride: UInt32, out_c_stride: UInt32, out_l_stride: UInt32, bias_stride: UInt32, silu_activation: Bool, ctx: Optional[DeviceContext] = None)

CPU implementation of the causal conv1d forward pass for channel-first layout, with bias and an optional SiLU activation (silu_activation).

Optimizations:

  1. Parallelizes over the combined batch * channel dimension using sync_parallelize.
  2. Pre-loads weights into registers to reduce memory accesses.
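
To make the operation concrete, here is a minimal reference sketch in plain Python. It is not the Mojo kernel. The helper name `causal_conv1d_channel_first_ref`, the tuple-style stride arguments, and the dense output buffer are assumptions made for illustration. It also assumes the common causal conv1d convention in which the last filter tap aligns with the current position and positions before the start of the sequence are treated as zero. The loop over `(b, c)` pairs corresponds to the work that the optimizations above parallelize, and `taps` stands in for the pre-loaded weights.

```python
import math


def causal_conv1d_channel_first_ref(
    batch, dim, seqlen, width,
    x, weight, bias,
    x_strides, weight_strides, out_strides,
    bias_stride=1, silu_activation=False,
):
    """Hypothetical reference for causal conv1d over flat, strided buffers."""
    x_b, x_c, x_l = x_strides        # batch, channel, length strides of x
    w_c, w_k = weight_strides        # channel, width strides of weight
    o_b, o_c, o_l = out_strides      # batch, channel, length strides of output
    # Assumption: the output is dense, so batch * dim * seqlen elements suffice.
    out = [0.0] * (batch * dim * seqlen)
    for b in range(batch):
        for c in range(dim):
            # Read this channel's filter taps once and reuse them for every position.
            taps = [weight[c * w_c + k * w_k] for k in range(width)]
            for l in range(seqlen):
                acc = bias[c * bias_stride]
                for k in range(width):
                    src = l - (width - 1) + k  # tap k reads from at or before l
                    if src >= 0:               # implicit zero padding on the left
                        acc += taps[k] * x[b * x_b + c * x_c + src * x_l]
                if silu_activation:
                    acc = acc / (1.0 + math.exp(-acc))  # SiLU: y * sigmoid(y)
                out[b * o_b + c * o_c + l * o_l] = acc
    return out


# One batch, two channels, four time steps, filter width three.
x = [1.0, 2.0, 3.0, 4.0,   # channel 0
     1.0, 1.0, 1.0, 1.0]   # channel 1
weight = [0.0, 0.0, 1.0,   # channel 0: passes the current sample through
          1.0, 1.0, 1.0]   # channel 1: sums the current and two previous samples
bias = [0.0, 0.5]

out = causal_conv1d_channel_first_ref(
    1, 2, 4, 3, x, weight, bias,
    x_strides=(8, 4, 1), weight_strides=(3, 1), out_strides=(8, 4, 1),
)
print(out)  # [1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 3.5]
```

Channel 1 shows the causal behavior: the first two outputs see fewer than three real samples, because earlier positions contribute zero.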

Args: