Mojo function
compute_mla_dispatch_scalars_runtime
compute_mla_dispatch_scalars_runtime(batch_size: Int, max_cache_valid_length: Int, q_max_seq_len: Int, num_heads: Int, is_fp8_kv: Bool, sm_count: Int) -> Tuple[Int, Int, Int]
Returns: Tuple[Int, Int, Int]