Mojo function
sliced_add
sliced_add[dtype: DType, //, target: StringSlice[StaticConstantOrigin]](c: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], a: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], b: LayoutTensor[dtype, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], lora_end_idx: LayoutTensor[DType.int64, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], ctx: Optional[DeviceContext])
Adds tensors a and b element-wise for rows < lora_end_idx; otherwise copies a.
This is used for LoRA, where only some sequences have LoRA applied:
- For rows in [0, lora_end_idx): c = a + b
- For rows in [lora_end_idx, batch_seq_len): c = a
Args:
- c (LayoutTensor): Output tensor.
- a (LayoutTensor): First input tensor.
- b (LayoutTensor): Second input tensor.
- lora_end_idx (LayoutTensor): Scalar tensor holding the end index of the LoRA token portion (the rows to which the add is applied).
- ctx (Optional): Device context for GPU operations.
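The row-wise behavior described above can be illustrated with a small NumPy sketch. This is not the Mojo implementation or API (which operates on LayoutTensor values and an optional DeviceContext); the function name and shapes here are reused purely for illustration:

```python
import numpy as np

def sliced_add(a: np.ndarray, b: np.ndarray, lora_end_idx: int) -> np.ndarray:
    """Rows in [0, lora_end_idx) get a + b; remaining rows are copied from a."""
    c = a.copy()
    c[:lora_end_idx] += b[:lora_end_idx]
    return c

# Example: batch of 4 rows, LoRA applied only to the first 2.
a = np.ones((4, 3))
b = np.full((4, 3), 2.0)
c = sliced_add(a, b, lora_end_idx=2)
# Rows 0-1 hold a + b; rows 2-3 are unchanged copies of a.
```

This mirrors the split semantics: the LoRA delta b contributes only to the token rows that belong to sequences with an active LoRA adapter.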