Mojo function

fused_qk_rope_ragged

fused_qk_rope_ragged[dtype: DType, freq_dtype: DType, collection_t: KVCollectionT, //, cache_t: KVCacheT, *, interleaved: Bool, target: StringSlice[StaticConstantOrigin], mrope_section: Optional[IntTuple[__origin_of()]] = None](q_proj: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], input_row_offsets: LayoutTensor[DType.uint32, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], kv_collection: collection_t, freqs_cis: LayoutTensor[freq_dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], position_ids: OptionalReg[LayoutTensor[DType.uint32, Layout.row_major[2](), MutableAnyOrigin]], layer_idx: UInt32, output: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], context: Optional[DeviceContext])

Applies RoPE (Rotary Position Embedding) to query and key tensors.

This function can apply RoPE to only the last rope_dim elements of each head, leaving the first unroped_dim elements unchanged. This is required for DeepSeek models, where only part of each head undergoes the rotary transformation.
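The partial rotation described above can be illustrated with a small reference sketch in plain Python. This is not the kernel's implementation: the helper name `partial_rope`, the representation of the frequencies as a list of `(cos, sin)` pairs, and the two pairing conventions chosen for `interleaved` are all assumptions made for illustration. It operates on a single head vector and ignores the ragged batching, KV cache, and device handling that the real function performs.

```python
def partial_rope(head, freqs_cis, interleaved=True):
    """Rotate only the last rope_dim elements of one head vector.

    head:       list of floats, length head_dim.
    freqs_cis:  list of (cos, sin) pairs, one per rotated pair
                (assumed layout; rope_dim = 2 * len(freqs_cis)).
    interleaved: if True, pair elements (2i, 2i+1) inside the roped
                region; otherwise pair (i, i + rope_dim // 2).
                Both conventions are assumptions for this sketch.
    """
    rope_dim = 2 * len(freqs_cis)
    if rope_dim > len(head):
        raise ValueError("rope_dim exceeds the head size")

    unroped_dim = len(head) - rope_dim
    half = rope_dim // 2

    # Start from a copy so the first unroped_dim elements stay unchanged.
    out = list(head)

    for i, (cos_v, sin_v) in enumerate(freqs_cis):
        if interleaved:
            a = unroped_dim + 2 * i
            b = a + 1
        else:
            a = unroped_dim + i
            b = unroped_dim + half + i

        x, y = head[a], head[b]
        # Standard 2D rotation of the pair (x, y).
        out[a] = x * cos_v - y * sin_v
        out[b] = x * sin_v + y * cos_v

    return out


# A 6-element head with rope_dim = 4, so unroped_dim = 2.
head = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
quarter_turn = [(0.0, 1.0), (0.0, 1.0)]  # cos = 0, sin = 1 for both pairs
rotated = partial_rope(head, quarter_turn, interleaved=True)
```

In this sketch the first two elements of `rotated` are identical to those of `head`, and only the trailing four elements are rotated pairwise.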
