Mojo function

fused_qk_rope_ragged

fused_qk_rope_ragged[dtype: DType, freq_dtype: DType, collection_t: KVCollectionT, //, cache_t: KVCacheT, *, interleaved: Bool, target: StringSlice[StaticConstantOrigin], mrope_types: Variadic[CoordLike] = , mrope_section: Optional[Coord[mrope_types]] = None, PositionIdsLayoutType: TensorLayout = Layout[RuntimeInt[DType.int64], RuntimeInt[DType.int64], RuntimeInt[DType.int64], ComptimeInt[1]]](q_proj: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], input_row_offsets: TileTensor[DType.uint32, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], kv_collection: collection_t, freqs_cis: TileTensor[freq_dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], position_ids: OptionalReg[TileTensor[DType.uint32, PositionIdsLayoutType, ImmutAnyOrigin]], layer_idx: UInt32, output: TileTensor[dtype, LayoutType, origin, address_space=address_space, linear_idx_type=linear_idx_type, element_shape_types=element_shape_types], context: Optional[DeviceContext])

Applies RoPE (Rotary Position Embedding) to query and key tensors.

This function can apply RoPE to only the last rope_dim elements of each head, leaving the first unroped_dim elements unchanged. This is required for DeepSeek models, where only part of each head undergoes the rotary transformation.
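To make the partial-rotation behavior concrete, here is a minimal reference sketch in plain Python for a single head vector. It is not the Mojo kernel: the helper name apply_partial_rope, the theta parameter, and the on-the-fly frequency schedule are assumptions for illustration (the real function reads precomputed values from freqs_cis), and the pairing shown for interleaved follows the common RoPE convention rather than anything this page confirms.

```python
import math


def apply_partial_rope(head, position, rope_dim, interleaved=True, theta=10000.0):
    """Rotate only the last `rope_dim` elements of one head vector.

    Hypothetical helper for illustration; not part of the Mojo API.
    """
    if rope_dim % 2 != 0 or rope_dim > len(head):
        raise ValueError("rope_dim must be even and no larger than the head size")

    unroped_dim = len(head) - rope_dim
    out = list(head)  # the first unroped_dim elements are copied unchanged
    half = rope_dim // 2

    for i in range(half):
        # Assumed standard RoPE frequency schedule; the real kernel takes
        # these values from the freqs_cis tensor instead of computing them.
        angle = position * theta ** (-2.0 * i / rope_dim)
        c, s = math.cos(angle), math.sin(angle)

        if interleaved:
            # Pair adjacent elements: (x0, x1), (x2, x3), ...
            a = unroped_dim + 2 * i
            b = a + 1
        else:
            # Pair element i with element i + half of the roped slice.
            a = unroped_dim + i
            b = unroped_dim + half + i

        x, y = head[a], head[b]
        out[a] = x * c - y * s
        out[b] = x * s + y * c

    return out


head = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
roped = apply_partial_rope(head, position=3, rope_dim=4)
print(roped[:2])  # [1.0, 2.0]
```

With a head size of 6 and rope_dim of 4, the first two elements (the unroped part) pass through untouched, while the last four are rotated in pairs. Because each pair undergoes a pure rotation, the norm of the roped slice is preserved.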
