Python class
ReasoningParser
class max.interfaces.ReasoningParser
Bases: ABC
Parser for identifying reasoning spans in model output.
from_tokenizer()
abstract async classmethod from_tokenizer(tokenizer)
Constructs a reasoning parser from a tokenizer.
Parameters:
tokenizer (PipelineTokenizer[Any, Any, Any]) – The PipelineTokenizer to use for resolving reasoning delimiter token IDs.
Returns:
A new ReasoningParser instance.
Return type:
ReasoningParser
is_prompt_in_reasoning()
is_prompt_in_reasoning(prompt_token_ids)
Decides whether the next generated token continues a reasoning span.
Called once at turn initiation with the full prompt token IDs (including any chat-template prefill). The result seeds the streaming reasoning state machine before the model emits its first token.
Multi-turn prompts can legitimately contain </think> tokens from prior assistant turns. The default implementation delegates to stream(), which scans left-to-right and would treat any such stale </think> as “reasoning has ended”, which is incorrect for the new assistant turn. Architectures whose chat templates emit reasoning delimiters per turn should override this method to consider only the most recent delimiter (e.g., with a right-to-left scan).
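The right-to-left scan suggested above might look like the following minimal sketch. It is written as a standalone function rather than a real ReasoningParser subclass. The token IDs, the boolean return value, and the fallback when no delimiter is found are all assumptions made for illustration; actual delimiter IDs would be resolved from the tokenizer.

```python
from collections.abc import Sequence

# Hypothetical delimiter token IDs (assumptions); real values come from the tokenizer.
THINK_START_ID = 1001  # stands in for <think>
THINK_END_ID = 1002    # stands in for </think>


def is_prompt_in_reasoning(prompt_token_ids: Sequence[int]) -> bool:
    """Return True if the most recent delimiter in the prompt opens a reasoning span."""
    # Scan right-to-left so that delimiters left over from earlier turns are ignored.
    for token_id in reversed(prompt_token_ids):
        if token_id == THINK_START_ID:
            return True
        if token_id == THINK_END_ID:
            return False
    # Assumption: with no delimiter at all, the model is not inside a reasoning span.
    return False


# A prior assistant turn closed its reasoning; the new turn is prefilled with <think>.
multi_turn_prompt = [5, THINK_START_ID, 6, THINK_END_ID, 7, 8, THINK_START_ID]
print(is_prompt_in_reasoning(multi_turn_prompt))
```

Because the scan stops at the first delimiter it meets from the right, the stale </think> from the earlier turn never influences the result.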
reset()
reset()
Resets per-request state.
Called at the start of each request to clear any internal state accumulated during a prior request.
Return type:
None
stream()
abstract stream(delta_token_ids)
Identifies a reasoning span within a streaming delta chunk.
Parameters:
delta_token_ids (Sequence[int]) – The token IDs of the incremental streaming chunk.
Returns:
A ParsedReasoningDelta containing the reasoning span, whether reasoning is still active, and an optional formatter for decoded reasoning text.
Return type:
ParsedReasoningDelta
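To illustrate how stream() and reset() could cooperate across chunks, here is a toy state machine. It does not subclass the real ReasoningParser or return a real ParsedReasoningDelta: ToyDelta, its field names, ToyReasoningParser, and the token IDs are all hypothetical stand-ins chosen for this sketch.

```python
from collections.abc import Sequence
from dataclasses import dataclass

# Hypothetical delimiter token IDs (assumptions); real values come from the tokenizer.
THINK_START_ID = 1001  # stands in for <think>
THINK_END_ID = 1002    # stands in for </think>


@dataclass
class ToyDelta:
    """Hypothetical stand-in for ParsedReasoningDelta; field names are assumptions."""

    reasoning_token_ids: list[int]
    content_token_ids: list[int]
    still_reasoning: bool


class ToyReasoningParser:
    """Toy parser that tracks whether the stream is inside a reasoning span."""

    def __init__(self) -> None:
        self.in_reasoning = False

    def reset(self) -> None:
        # Clear state carried over from a previous request.
        self.in_reasoning = False

    def stream(self, delta_token_ids: Sequence[int]) -> ToyDelta:
        reasoning: list[int] = []
        content: list[int] = []
        for token_id in delta_token_ids:
            if token_id == THINK_START_ID:
                self.in_reasoning = True
            elif token_id == THINK_END_ID:
                self.in_reasoning = False
            elif self.in_reasoning:
                reasoning.append(token_id)
            else:
                content.append(token_id)
        return ToyDelta(reasoning, content, self.in_reasoning)


parser = ToyReasoningParser()
first = parser.stream([THINK_START_ID, 11, 12])   # span opens and stays open
second = parser.stream([13, THINK_END_ID, 21])    # span closes mid-chunk
print(first.still_reasoning, second.still_reasoning)
```

The state kept between calls is what lets a span that opens in one chunk close in a later one; calling reset() before the next request prevents that state from leaking across requests.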