Python class
SyntheticRunner
class max.pipelines.sampling.SyntheticRunner(session, device_ref, synthetic_acceptance_rate, num_speculative_tokens)
Bases: RejectionRunner
Synthetic acceptance sampler for benchmarking.
Replaces model-driven acceptance with independent per-position Bernoulli draws, calibrated so that the mean joint acceptance across num_speculative_tokens positions matches synthetic_acceptance_rate. The acceptance decision ignores the actual draft and target logits, so real model quality is not measured.
A fresh seed is bound on each call so that the random draws vary across executions; with a fixed seed, a single deterministic realization would dominate the benchmark.
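The calibration described above can be sketched in plain Python. This is a minimal illustration under one assumed reading of "mean joint acceptance": the joint acceptance at position i is taken to be p**i (position i and every earlier position accepted), and p is chosen so that the average of p**i over all positions equals the target rate. The helper names per_position_rate, mean_joint_acceptance, and accepted_prefix_length are hypothetical and are not part of the MAX API; the library's actual calibration may differ.

```python
import random
import secrets


def mean_joint_acceptance(p: float, num_positions: int) -> float:
    """Average over positions i = 1..k of p**i (assumed definition)."""
    return sum(p ** i for i in range(1, num_positions + 1)) / num_positions


def per_position_rate(target_rate: float, num_positions: int, iters: int = 60) -> float:
    """Bisect for p in [0, 1] with mean_joint_acceptance(p, k) == target_rate."""
    if not 0.0 <= target_rate <= 1.0:
        raise ValueError("target_rate must be in [0, 1]")
    if num_positions < 1:
        raise ValueError("num_positions must be at least 1")
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        # mean_joint_acceptance is monotonically increasing in p.
        if mean_joint_acceptance(mid, num_positions) < target_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def accepted_prefix_length(p: float, num_positions: int, rng: random.Random) -> int:
    """Draw one independent Bernoulli(p) per position; stop at the first rejection."""
    accepted = 0
    for _ in range(num_positions):
        if rng.random() >= p:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    p = per_position_rate(target_rate=0.6, num_positions=4)
    # A fresh seed per call, so repeated runs see different realizations.
    rng = random.Random(secrets.randbits(64))
    print(p, accepted_prefix_length(p, 4, rng))
```

Under this reading, the expected number of accepted draft tokens per call is num_positions times the target rate, which is what makes the synthetic rate comparable across different speculation depths.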
Parameters:
- session (InferenceSession)
- device_ref (DeviceRef)
- synthetic_acceptance_rate (float)
- num_speculative_tokens (int)
run()
run(draft_tokens, draft_logits, target_logits, target_logit_offsets, all_draft_logits, context_batch)
Runs the synthetic acceptance graph with a fresh per-call seed.
draft_logits, target_logit_offsets, all_draft_logits, and context_batch are ignored; synthetic acceptance uses only draft_tokens and target_logits (for the recovered/bonus argmax).
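As a rough illustration of how draft_tokens and target_logits could combine once the acceptance draws are made, the sketch below follows the usual speculative-decoding convention: keep the accepted prefix of draft tokens, then take the argmax of the target logits at the first rejected position (the recovered token) or at the extra final position when every draft token was accepted (the bonus token). The function synthetic_step, its list-based inputs, and the assumption of k + 1 target logit rows are hypothetical; this is not the signature or the internals of run().

```python
from typing import List, Sequence, Tuple


def synthetic_step(
    draft_tokens: Sequence[int],
    target_logits: Sequence[Sequence[float]],
    accept_flags: Sequence[bool],
) -> Tuple[List[int], int]:
    """Return (accepted draft tokens, recovered-or-bonus token).

    Assumes k draft tokens, k acceptance flags, and k + 1 rows of target logits.
    """
    k = len(draft_tokens)
    if len(accept_flags) != k or len(target_logits) != k + 1:
        raise ValueError("expected k flags and k + 1 target logit rows")

    # Count the accepted prefix: acceptance stops at the first rejection.
    num_accepted = 0
    for flag in accept_flags:
        if not flag:
            break
        num_accepted += 1

    # Row num_accepted is the first rejected position, or the bonus row if
    # all k draft tokens were accepted.
    row = target_logits[num_accepted]
    next_token = max(range(len(row)), key=row.__getitem__)
    return list(draft_tokens[:num_accepted]), next_token
```

Note that the draft logits never appear here: only the acceptance flags decide how many draft tokens survive, which matches the statement that draft_logits is ignored.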