Mojo module
varlen_causal_conv1d
Causal Conv1D with variable length sequence support (vLLM interface).
This module implements causal 1D convolution operations that support variable length sequences using cumulative sequence lengths (cu_seqlens), compatible with the vLLM inference interface.
Key Functions:

- causal_conv1d_varlen_fwd: Forward pass for varlen sequences
- causal_conv1d_varlen_update: Update function for decode
- causal_conv1d_varlen_states: Extract states from varlen sequences
vLLM Interface:

- x: (dim, cu_seq_len) for varlen - sequences concatenated left to right
- query_start_loc: (batch + 1) int32 - cumulative sequence lengths
- cache_indices: (batch) int32 - indices into conv_states
- has_initial_state: (batch) bool - whether to use initial state
- conv_states: (..., dim, width - 1) - states updated in-place
- activation: None or "silu" or "swish"
- pad_slot_id: int - for identifying padded entries
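To make the packed layout concrete, here is a minimal pure-Python reference of a causal conv1d over variable length sequences. It is a sketch, not this module's implementation: the function name `causal_conv1d_varlen_ref`, the `weight` and `bias` arguments, and the nested-list representation are assumptions made for illustration, and it leaves out `cache_indices`, `has_initial_state`, and `conv_states` (each sequence starts from zeros).

```python
import math


def silu(v):
    # SiLU (also called swish): v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def causal_conv1d_varlen_ref(x, weight, bias, query_start_loc, activation=None):
    """Reference causal conv1d over packed sequences (hypothetical helper).

    x:               dim rows, each of length total_tokens (packed left to right)
    weight:          dim rows, each of length width
    bias:            dim values
    query_start_loc: batch + 1 cumulative offsets into the token axis
    """
    dim, width = len(weight), len(weight[0])
    out = [[0.0] * len(x[0]) for _ in range(dim)]
    for b in range(len(query_start_loc) - 1):
        start, end = query_start_loc[b], query_start_loc[b + 1]
        for d in range(dim):
            for t in range(start, end):
                acc = bias[d]
                for k in range(width):
                    # The last tap (k == width - 1) reads the current token.
                    src = t - (width - 1) + k
                    # Never read before `start`: keeps the conv causal and
                    # stops one sequence from leaking into the next.
                    if src >= start:
                        acc += weight[d][k] * x[d][src]
                out[d][t] = silu(acc) if activation in ("silu", "swish") else acc
    return out


# Two sequences of lengths 3 and 2 packed into one row of 5 tokens.
result = causal_conv1d_varlen_ref(
    x=[[1.0, 2.0, 3.0, 4.0, 5.0]],
    weight=[[1.0, 1.0]],
    bias=[0.0],
    query_start_loc=[0, 3, 5],
)
print(result)  # [[1.0, 3.0, 5.0, 4.0, 9.0]]
```

Note that token 3 (the first token of the second sequence) yields 4.0 rather than 7.0: the window does not reach back across the boundary given by `query_start_loc`.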
comptime values
PAD_SLOT_ID
comptime PAD_SLOT_ID = Int32(-1)
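As an illustration of how a pad slot id is typically used, the Python sketch below filters out padded batch entries. It assumes the vLLM convention that a batch entry whose `cache_indices` value equals `pad_slot_id` is skipped; this page only states that the value identifies padded entries, and the helper name `active_batch_entries` is hypothetical.

```python
PAD_SLOT_ID = -1  # mirrors the module constant Int32(-1)


def active_batch_entries(cache_indices, pad_slot_id=PAD_SLOT_ID):
    """Return the batch positions whose cache slot is not padding.

    Assumption: entries equal to pad_slot_id are not processed.
    """
    return [i for i, slot in enumerate(cache_indices) if slot != pad_slot_id]


print(active_batch_entries([PAD_SLOT_ID, 1, 20, PAD_SLOT_ID]))  # [1, 2]
```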
Functions
- causal_conv1d_varlen_fwd_cpu: Forward pass for causal conv1d with variable length sequences.
- causal_conv1d_varlen_fwd_gpu: GPU kernel for causal conv1d forward with variable length sequences.
- causal_conv1d_varlen_states_cpu: Extract the last state_len elements from each variable length sequence.
- causal_conv1d_varlen_states_gpu: GPU kernel for extracting states from variable length sequences.
- causal_conv1d_varlen_update_cpu: Update function for causal conv1d decode.
- causal_conv1d_varlen_update_gpu: GPU kernel for causal conv1d update (decode step).
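The state extraction and decode update steps can be sketched in plain Python as follows. This is a reference under stated assumptions rather than the signatures of the functions listed above: `varlen_states_ref` and `update_ref` are hypothetical names, the `weight` and `bias` arguments are assumed, sequences shorter than `state_len` are assumed to be left-padded with zeros, and activation and cache indexing are omitted.

```python
def varlen_states_ref(x, query_start_loc, state_len):
    """Last state_len tokens of each packed sequence (hypothetical helper).

    Returns states[b][d], a list of state_len values per sequence and channel.
    Assumption: shorter sequences are left-padded with zeros.
    """
    states = []
    for b in range(len(query_start_loc) - 1):
        start, end = query_start_loc[b], query_start_loc[b + 1]
        take = min(state_len, end - start)
        per_seq = []
        for row in x:
            tail = row[end - take:end]
            per_seq.append([0.0] * (state_len - take) + tail)
        states.append(per_seq)
    return states


def update_ref(state, x_new, weight, bias):
    """One decode step for one sequence (hypothetical helper).

    state: dim rows holding the width - 1 previous inputs, mutated in place
    x_new: dim values, the newly arrived token
    """
    out = []
    for d in range(len(weight)):
        window = state[d] + [x_new[d]]  # width values, oldest first
        out.append(bias[d] + sum(w * v for w, v in zip(weight[d], window)))
        state[d][:] = window[1:]  # slide the window, dropping the oldest input
    return out


# Two sequences of lengths 3 and 2 packed into one row of 5 tokens.
x = [[1.0, 2.0, 3.0, 4.0, 5.0]]
print(varlen_states_ref(x, [0, 3, 5], state_len=3))
# [[[1.0, 2.0, 3.0]], [[0.0, 4.0, 5.0]]]

# Decode one token for the first sequence with a width-2 kernel.
state = varlen_states_ref(x, [0, 3, 5], state_len=1)[0]  # [[3.0]]
print(update_ref(state, [10.0], weight=[[1.0, 1.0]], bias=[0.0]))  # [13.0]
print(state)  # [[10.0]]
```

Mutating `state` in place mirrors the interface note that conv_states is updated in-place, so the next decode step sees the shifted window without copying.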