Mojo function
apply_penalties_to_logits
apply_penalties_to_logits[logit_type: DType, penalty_type: DType, //, target: StringSlice[StaticConstantOrigin]](logits: LayoutTensor[logit_type, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], compressed_frequency_data: LayoutTensor[int32, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], frequency_offsets: LayoutTensor[uint32, layout, origin, address_space=address_space, element_layout=element_layout, layout_int_type=layout_int_type, linear_idx_type=linear_idx_type, masked=masked, alignment=alignment], frequency_penalty: SIMD[penalty_type, 1], presence_penalty: SIMD[penalty_type, 1], ctx: DeviceContextPtr)
Apply penalties to the logits based on the frequency of the tokens in the batch.
The frequency data is stored in CSR (compressed sparse row) format: frequency_offsets gives the starting row of each sequence in compressed_frequency_data. compressed_frequency_data is a 2D array in which each row i describes one token:
- compressed_frequency_data[i, 0] is the token id
- compressed_frequency_data[i, 1] is the frequency of that token in the sequence
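The layout described above can be illustrated with a small plain-Python reference sketch. It is not the Mojo kernel, and it rests on assumptions the text above does not confirm: that frequency_offsets carries one trailing entry marking the end of the last sequence (the usual CSR convention), and that the penalty follows the common formulation in which a token seen `count` times has `frequency_penalty * count + presence_penalty` subtracted from its logit. The helper name `apply_penalties_reference` is hypothetical.

```python
def apply_penalties_reference(
    logits,
    compressed_frequency_data,
    frequency_offsets,
    frequency_penalty,
    presence_penalty,
):
    """Reference sketch only; mutates and returns `logits` ([batch][vocab])."""
    # ASSUMPTION: frequency_offsets has batch_size + 1 entries, so sequence s
    # owns rows frequency_offsets[s] .. frequency_offsets[s + 1] - 1.
    num_sequences = len(frequency_offsets) - 1
    for seq in range(num_sequences):
        start = frequency_offsets[seq]
        end = frequency_offsets[seq + 1]
        for row in range(start, end):
            token_id, count = compressed_frequency_data[row]
            if count > 0:
                # ASSUMPTION: common penalty formula, not confirmed by the docs.
                logits[seq][token_id] -= (
                    frequency_penalty * count + presence_penalty
                )
    return logits


# Two sequences over a 4-token vocabulary.
logits = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
# Rows are (token id, frequency). Rows 0-1 belong to sequence 0, row 2 to sequence 1.
compressed_frequency_data = [[0, 2], [3, 1], [1, 3]]
frequency_offsets = [0, 2, 3]

result = apply_penalties_reference(
    logits, compressed_frequency_data, frequency_offsets, 0.5, 1.0
)
print(result)  # [[-1.0, 2.0, 3.0, 2.5], [0.5, -2.0, 0.5, 0.5]]
```

Only tokens that appear in a sequence's rows are touched, which is why the CSR layout avoids storing a full batch-by-vocabulary count matrix.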