
Python module

lora

AttentionWithRopeAndLoRA

class max.nn.legacy.lora.AttentionWithRopeAndLoRA(*, rope, num_attention_heads, num_key_value_heads, hidden_size, kv_params, max_lora_rank, max_num_loras, devices=None, dtype=float32, linear_cls=<class 'max.nn.legacy.linear.Linear'>, stacked_qkv=False, scale=None, has_bias=False, float8_config=None, clip_qkv=None)

Parameters:

rope: RotaryEmbedding

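The sketch below shows one way the constructor above might be called, assuming DType and DeviceRef are importable from max.dtype and max.graph as in current MAX releases. The rope and kv_params objects are left as placeholders because their construction is not documented on this page, and the head counts, hidden size, and LoRA limits are illustrative values only.

```python
from max.dtype import DType
from max.graph import DeviceRef
from max.nn.legacy.lora import AttentionWithRopeAndLoRA

# Assumed to be built elsewhere in the model definition; their
# construction is not covered on this page.
rope = ...        # a RotaryEmbedding instance shared with the rest of the model
kv_params = ...   # KV-cache parameters (cache dtype, head dim, etc.)

attn = AttentionWithRopeAndLoRA(
    rope=rope,
    num_attention_heads=32,      # illustrative Llama-style head count
    num_key_value_heads=8,
    hidden_size=4096,
    kv_params=kv_params,
    max_lora_rank=16,            # largest adapter rank the layer must accommodate
    max_num_loras=8,             # number of adapters that can be resident at once
    devices=[DeviceRef.GPU()],
    dtype=DType.bfloat16,
)
```
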
LinearLoRA

class max.nn.legacy.lora.LinearLoRA(in_dim, out_dim, max_num_loras, max_lora_rank, dtype, device, has_lora_bias=False, name=None, quantization_encoding=None)

Parameters:

set_lora_batch_info()

set_lora_batch_info(lora_ids, lora_ranks, lora_grouped_offsets, num_active_loras, lora_end_idx, batch_seq_len, lora_ids_kv, lora_grouped_offsets_kv)

Parameters:

Return type:

None

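The sketch below combines the two signatures above: it builds a LinearLoRA layer and then passes it per-batch LoRA routing metadata, again assuming DType and DeviceRef are importable from max.dtype and max.graph. The arguments to set_lora_batch_info are left as placeholders, and the inline comments describe what each one plausibly carries based on its name; they are not authoritative descriptions.

```python
from max.dtype import DType
from max.graph import DeviceRef
from max.nn.legacy.lora import LinearLoRA

lora_linear = LinearLoRA(
    in_dim=4096,
    out_dim=4096,
    max_num_loras=8,       # adapters that can be loaded simultaneously
    max_lora_rank=16,      # upper bound on any adapter's rank
    dtype=DType.bfloat16,
    device=DeviceRef.GPU(),
    has_lora_bias=False,
)

# Per-batch routing metadata, typically produced by the serving layer that
# groups requests by adapter.  Each value below is a placeholder for the
# tensor or graph value the caller would supply.
lora_linear.set_lora_batch_info(
    lora_ids=...,                  # adapter id assigned to each request group
    lora_ranks=...,                # rank of each active adapter
    lora_grouped_offsets=...,      # offsets delimiting each adapter's tokens
    num_active_loras=...,          # how many adapters appear in this batch
    lora_end_idx=...,              # index where LoRA-routed tokens end
    batch_seq_len=...,             # total length of the flattened batch
    lora_ids_kv=...,               # adapter ids for the KV projections
    lora_grouped_offsets_kv=...,   # grouped offsets for the KV projections
)
```
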
SupportsLoRA

class max.nn.legacy.lora.SupportsLoRA(*args, **kwargs)

Base class for supporting LoRA functionality in Modules.

set_lora_batch_info()

set_lora_batch_info(lora_ids, lora_ranks, lora_grouped_offsets, num_active_loras, lora_end_idx, batch_seq_len, lora_ids_kv, lora_grouped_offsets_kv)

Parameters:

Return type:

None

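Because SupportsLoRA is the base class for LoRA-aware modules, a common pattern is to implement set_lora_batch_info by fanning the metadata out to every LoRA-capable sublayer. The class below is a hypothetical illustration of that pattern, not code from the library; in a real model it would typically also inherit from the framework's Module class, which is omitted here.

```python
from max.nn.legacy.lora import LinearLoRA, SupportsLoRA


class MyLoRABlock(SupportsLoRA):
    """Hypothetical block that owns several LoRA-capable projections."""

    def __init__(self, q_proj: LinearLoRA, k_proj: LinearLoRA, v_proj: LinearLoRA):
        super().__init__()
        self.q_proj = q_proj
        self.k_proj = k_proj
        self.v_proj = v_proj

    def set_lora_batch_info(self, *args, **kwargs) -> None:
        # Fan the per-batch LoRA metadata out to every LoRA-capable sublayer
        # so they all slice the adapter weights consistently.
        for layer in (self.q_proj, self.k_proj, self.v_proj):
            layer.set_lora_batch_info(*args, **kwargs)
```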