Mojo function

causal_conv1d_update_cpu

def causal_conv1d_update_cpu[x_dtype: DType, conv_state_dtype: DType, weight_dtype: DType, output_dtype: DType, bias_dtype: DType](batch: Int, dim: Int, seqlen: Int, width: Int, state_len: Int, x: TileTensor[x_dtype, Storage=x.Storage, address_space=x.address_space, linear_idx_type=x.linear_idx_type], conv_state: TileTensor[conv_state_dtype, Storage=conv_state.Storage, address_space=conv_state.address_space, linear_idx_type=conv_state.linear_idx_type], weight: TileTensor[weight_dtype, Storage=weight.Storage, address_space=weight.address_space, linear_idx_type=weight.linear_idx_type], output: TileTensor[output_dtype, Storage=output.Storage, address_space=output.address_space, linear_idx_type=output.linear_idx_type], bias: TileTensor[bias_dtype, Storage=bias.Storage, address_space=bias.address_space, linear_idx_type=bias.linear_idx_type], x_batch_stride: UInt32, x_c_stride: UInt32, x_l_stride: UInt32, conv_state_batch_stride: UInt32, conv_state_c_stride: UInt32, conv_state_l_stride: UInt32, weight_c_stride: UInt32, weight_width_stride: UInt32, out_batch_stride: UInt32, out_c_stride: UInt32, out_l_stride: UInt32, silu_activation: Bool)

CPU implementation of causal conv1d update for incremental inference.

This kernel:

Concatenates conv_state with x to form a sliding window
Computes convolution output for the new positions
Updates conv_state with the new values from x

Simple mode (no circular buffer):

conv_state holds the last state_len values
New x values are appended and the oldest values are shifted out
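
The three steps and the simple (non-circular) mode described above can be sketched in plain Python for a single (batch, channel) pair. This is a reference sketch, not the Mojo kernel: the function names `silu` and `causal_conv1d_update_ref` are hypothetical, and it assumes that conv_state is stored oldest-first, that the last weight tap multiplies the newest sample, and that SiLU is applied after the bias is added.

```python
import math


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def causal_conv1d_update_ref(state, x, weight, bias=0.0, silu_activation=False):
    """Reference update for one (batch, channel) pair (hypothetical helper).

    state:  list of state_len values, oldest first; updated in place.
    x:      list of seqlen new input values.
    weight: list of width taps; weight[-1] is assumed to multiply the newest sample.
    """
    state_len, width = len(state), len(weight)
    assert state_len >= width - 1, "state_len must be >= width - 1"

    # Step 1: concatenate the saved state with the new inputs.
    window = state + x

    # Step 2: one output per new position, using the width most recent samples.
    out = []
    for t in range(len(x)):
        end = state_len + t + 1
        taps = window[end - width:end]
        acc = bias + sum(w * v for w, v in zip(weight, taps))
        out.append(silu(acc) if silu_activation else acc)

    # Step 3: keep only the last state_len samples (the oldest are shifted out).
    state[:] = window[len(window) - state_len:]
    return out
```

For example, with `state = [1.0, 2.0, 3.0]`, `x = [4.0]`, and `weight = [0.5, 0.25, 0.125, 1.0]`, the window is `[1.0, 2.0, 3.0, 4.0]` and the state becomes `[2.0, 3.0, 4.0]` after the call.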

Args:

batch (Int): Batch size.
dim (Int): Number of channels.
seqlen (Int): Sequence length of input x (typically 1).
width (Int): Kernel width.
state_len (Int): Length of conv_state (must be >= width - 1).
x (TileTensor[x_dtype, Storage=x.Storage, address_space=x.address_space, linear_idx_type=x.linear_idx_type]): Input tensor.
conv_state (TileTensor[conv_state_dtype, Storage=conv_state.Storage, address_space=conv_state.address_space, linear_idx_type=conv_state.linear_idx_type]): Convolution state buffer (modified in-place).
weight (TileTensor[weight_dtype, Storage=weight.Storage, address_space=weight.address_space, linear_idx_type=weight.linear_idx_type]): Convolution weights.
output (TileTensor[output_dtype, Storage=output.Storage, address_space=output.address_space, linear_idx_type=output.linear_idx_type]): Output tensor.
bias (TileTensor[bias_dtype, Storage=bias.Storage, address_space=bias.address_space, linear_idx_type=bias.linear_idx_type]): Bias tensor.
x_batch_stride (UInt32): Stride for batch dimension in x.
x_c_stride (UInt32): Stride for channel dimension in x.
x_l_stride (UInt32): Stride for sequence length dimension in x.
conv_state_batch_stride (UInt32): Stride for batch dimension in conv_state.
conv_state_c_stride (UInt32): Stride for channel dimension in conv_state.
conv_state_l_stride (UInt32): Stride for state length dimension in conv_state.
weight_c_stride (UInt32): Stride for channel dimension in weight.
weight_width_stride (UInt32): Stride for kernel width dimension in weight.
out_batch_stride (UInt32): Stride for batch dimension in output.
out_c_stride (UInt32): Stride for channel dimension in output.
out_l_stride (UInt32): Stride for sequence length dimension in output.
silu_activation (Bool): Whether to apply SiLU activation.
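
The stride arguments above describe how a (batch, channel, position) index maps to a location in each buffer. A minimal Python sketch of that mapping follows. It assumes strides are counted in elements rather than bytes, which this page does not state, and the helper names `flat_offset` and `contiguous_strides` are hypothetical.

```python
def flat_offset(b, c, l, batch_stride, c_stride, l_stride):
    """Flat element offset of index (b, c, l), assuming element-sized strides."""
    return b * batch_stride + c * c_stride + l * l_stride


def contiguous_strides(dim, length):
    """Strides for a densely packed (batch, dim, length) buffer."""
    return dim * length, length, 1


# Example: x with shape (batch=2, dim=3, seqlen=4), densely packed.
x_batch_stride, x_c_stride, x_l_stride = contiguous_strides(3, 4)
offset = flat_offset(1, 2, 3, x_batch_stride, x_c_stride, x_l_stride)
```

Passing strides explicitly lets the same kernel read non-contiguous layouts, such as a channels-last buffer, without copying the data first.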