Python module

tokenizer

Implementations of provided tokenizers.

IdentityPipelineTokenizer

class max.pipelines.tokenizer.IdentityPipelineTokenizer(*args, **kwargs)

decode()

async decode(context: TokenGeneratorContext, encoded: str, **kwargs) → str

Decodes response tokens to text.

  • Parameters:

    • context (TokenGeneratorContext) – Current generation context.
    • encoded (TokenizerEncoded) – Encoded response tokens.
  • Returns:

    Un-encoded response text.

  • Return type:

    str

encode()

async encode(prompt: str) → str

Encodes text prompts as tokens.

  • Parameters:

    prompt (str) – Un-encoded prompt text.

  • Raises:

    ValueError – If the prompt exceeds the configured maximum length.

  • Returns:

    Encoded prompt tokens.

  • Return type:

    TokenizerEncoded
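Both signatures above operate on `str`, which suggests that this tokenizer passes text through unchanged. The following is a minimal sketch of that pass-through behavior, not the MAX implementation: `IdentityTokenizerSketch` and `round_trip` are hypothetical names, the block does not import `max`, and it does not model the `ValueError` length check.

```python
import asyncio


class IdentityTokenizerSketch:
    """Hypothetical stand-in, not the MAX class: the tokens are the text itself."""

    async def encode(self, prompt: str) -> str:
        # Identity encoding: the "tokens" are the prompt string, unchanged.
        return prompt

    async def decode(self, context: object, encoded: str, **kwargs) -> str:
        # Identity decoding: the context is accepted but unused in this sketch.
        return encoded


async def round_trip(text: str) -> str:
    tokenizer = IdentityTokenizerSketch()
    encoded = await tokenizer.encode(text)
    return await tokenizer.decode(None, encoded)


result = asyncio.run(round_trip("hello world"))
```

Because both methods are coroutines, callers need to `await` them or drive them with an event loop, as `asyncio.run` does here.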

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a "content" property. Text messages are formatted as { "type": "text", "content": "text content" } instead of the OpenAI spec format, { "type": "text", "text": "text content" }. NOTE: Multimodal messages omit the content property. Both "image_urls" and "image" content parts are converted to simply { "type": "image" }. Their content is provided as byte arrays through the top-level property on the request object, i.e. "TokenGeneratorRequest.images".
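The wrapping rule described above can be sketched as a small conversion function. This is an illustration only: `wrap_content_parts` is a hypothetical helper, not part of `max.pipelines.tokenizer`, and passing unknown part types through unchanged is an assumption of this sketch.

```python
from typing import Any


def wrap_content_parts(parts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: convert OpenAI-style content parts to the wrapped form."""
    wrapped = []
    for part in parts:
        kind = part.get("type")
        if kind == "text":
            # The OpenAI spec uses a "text" key; the wrapped form uses "content".
            wrapped.append({"type": "text", "content": part["text"]})
        elif kind in ("image", "image_urls"):
            # Image parts keep only their type; the bytes travel separately
            # on the request object (TokenGeneratorRequest.images).
            wrapped.append({"type": "image"})
        else:
            # Assumption: unknown part types are passed through untouched.
            wrapped.append(dict(part))
    return wrapped


openai_parts = [
    {"type": "text", "text": "Describe this picture."},
    {"type": "image_urls", "image_urls": {"url": "https://example.com/cat.png"}},
]
wrapped_parts = wrap_content_parts(openai_parts)
# wrapped_parts == [
#     {"type": "text", "content": "Describe this picture."},
#     {"type": "image"},
# ]
```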

PreTrainedPipelineTokenizer

class max.pipelines.tokenizer.PreTrainedPipelineTokenizer(delegate: PreTrainedTokenizer | PreTrainedTokenizerFast)

apply_chat_template()

apply_chat_template(messages: list[max.pipelines.interfaces.text_generation.TokenGeneratorRequestMessage]) → str

decode()

async decode(context: TokenGeneratorContext, encoded: ndarray, **kwargs) → str

Decodes response tokens to text.

  • Parameters:

    • context (TokenGeneratorContext) – Current generation context.
    • encoded (TokenizerEncoded) – Encoded response tokens.
  • Returns:

    Un-encoded response text.

  • Return type:

    str

encode()

async encode(prompt: str) → ndarray

Encodes text prompts as tokens.

  • Parameters:

    prompt (str) – Un-encoded prompt text.

  • Raises:

    ValueError – If the prompt exceeds the configured maximum length.

  • Returns:

    Encoded prompt tokens.

  • Return type:

    TokenizerEncoded
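The `ValueError` contract documented above can be illustrated with a toy encoder. This sketch makes several assumptions: `LengthCheckedEncoderSketch` is a hypothetical class, the byte-level tokenization is a stand-in for the real delegate, and it returns a plain list rather than an `ndarray` to stay dependency-free.

```python
import asyncio


class LengthCheckedEncoderSketch:
    """Hypothetical encoder that rejects prompts longer than max_length tokens."""

    def __init__(self, max_length: int) -> None:
        self.max_length = max_length

    async def encode(self, prompt: str) -> list[int]:
        # Toy tokenization: one token ID per UTF-8 byte (not a real tokenizer).
        tokens = list(prompt.encode("utf-8"))
        if len(tokens) > self.max_length:
            raise ValueError(
                f"Prompt has {len(tokens)} tokens, above the limit of {self.max_length}."
            )
        return tokens


encoder = LengthCheckedEncoderSketch(max_length=4)
short_tokens = asyncio.run(encoder.encode("hi"))  # [104, 105]

try:
    asyncio.run(encoder.encode("too long"))
    too_long_rejected = False
except ValueError:
    too_long_rejected = True
```

Callers that accept arbitrary user prompts would typically catch the `ValueError` and report it as a request error rather than let it propagate.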

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a "content" property. Text messages are formatted as { "type": "text", "content": "text content" } instead of the OpenAI spec format, { "type": "text", "text": "text content" }. NOTE: Multimodal messages omit the content property. Both "image_urls" and "image" content parts are converted to simply { "type": "image" }. Their content is provided as byte arrays through the top-level property on the request object, i.e. "TokenGeneratorRequest.images".

TextAndVisionTokenizer

class max.pipelines.tokenizer.TextAndVisionTokenizer(config: PipelineConfig)

Encapsulates creation of TextAndVisionContext and specific token encode/decode logic.

apply_chat_template()

apply_chat_template(messages: list[max.pipelines.interfaces.text_generation.TokenGeneratorRequestMessage]) → str

decode()

async decode(context: TextAndVisionContext, encoded: ndarray, **kwargs) → str

Transforms a provided encoded token array back into readable text.

encode()

async encode(prompt: str | Sequence[int]) → ndarray

Transform the provided prompt into a token array.

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a "content" property. Text messages are formatted as { "type": "text", "content": "text content" } instead of the OpenAI spec format, { "type": "text", "text": "text content" }. NOTE: Multimodal messages omit the content property. Both "image_urls" and "image" content parts are converted to simply { "type": "image" }. Their content is provided as byte arrays through the top-level property on the request object, i.e. "TokenGeneratorRequest.images".

new_context()

async new_context(request: TokenGeneratorRequest) → TextAndVisionContext

Creates a new TextAndVisionContext object, using information such as cache_seq_id and prompt from the TokenGeneratorRequest.

TextTokenizer

class max.pipelines.tokenizer.TextTokenizer(config: PipelineConfig, enable_llama_whitespace_fix: bool = False)

Encapsulates creation of TextContext and specific token encode/decode logic.

apply_chat_template()

apply_chat_template(messages: list[max.pipelines.interfaces.text_generation.TokenGeneratorRequestMessage], tools: list[max.pipelines.interfaces.text_generation.TokenGeneratorRequestTool] | None) → str

decode()

async decode(context: TextContext, encoded: ndarray, **kwargs) → str

Transforms a provided encoded token array back into readable text.

encode()

async encode(prompt: str | Sequence[int]) → ndarray

Transform the provided prompt into a token array.

eos

property eos: int

The end of sequence token for this tokenizer.

expects_content_wrapping

property expects_content_wrapping: bool

If true, this tokenizer expects messages to have a "content" property. Text messages are formatted as { "type": "text", "content": "text content" } instead of the OpenAI spec format, { "type": "text", "text": "text content" }. NOTE: Multimodal messages omit the content property. Both "image_urls" and "image" content parts are converted to simply { "type": "image" }. Their content is provided as byte arrays through the top-level property on the request object, i.e. "TokenGeneratorRequest.images".

new_context()

async new_context(request: TokenGeneratorRequest) → TextContext

Creates a new TextContext object, using information such as cache_seq_id and prompt from the TokenGeneratorRequest.

max_tokens_to_generate()

max.pipelines.tokenizer.max_tokens_to_generate(prompt_size: int, max_length: int, max_new_tokens: int = -1) → int

Returns the max number of new tokens to generate.
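The reference does not spell out how the three arguments combine. One plausible reading, sketched below under stated assumptions, is that the result is the room left after the prompt (`max_length - prompt_size`, floored at zero), further capped by `max_new_tokens` when that value is non-negative. The function name `sketch_max_tokens_to_generate` is hypothetical, and treating `-1` as "no explicit cap" is an assumption drawn from the default value.

```python
def sketch_max_tokens_to_generate(
    prompt_size: int, max_length: int, max_new_tokens: int = -1
) -> int:
    # Room left in the sequence once the prompt is accounted for, never negative.
    remaining = max(max_length - prompt_size, 0)
    if max_new_tokens < 0:
        # Assumption: a negative value means "no explicit cap".
        return remaining
    # Otherwise honor the smaller of the explicit cap and the remaining room.
    return min(max_new_tokens, remaining)


print(sketch_max_tokens_to_generate(10, 100))      # 90
print(sketch_max_tokens_to_generate(10, 100, 20))  # 20
print(sketch_max_tokens_to_generate(90, 100, 20))  # 10
print(sketch_max_tokens_to_generate(120, 100))     # 0
```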

run_with_default_executor()

async max.pipelines.tokenizer.run_with_default_executor(fn, *args)
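The name and signature suggest a helper that runs a blocking callable on the event loop's default executor so that it does not stall other coroutines. The sketch below shows that general asyncio pattern using only the standard library. It is an assumption about the intent, not the MAX source, and `run_in_default_executor_sketch` and `slow_add` are hypothetical names.

```python
import asyncio
import time


async def run_in_default_executor_sketch(fn, *args):
    # Passing None selects the loop's default executor (a thread pool).
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, fn, *args)


def slow_add(a: int, b: int) -> int:
    time.sleep(0.01)  # stands in for blocking work such as tokenization
    return a + b


total = asyncio.run(run_in_default_executor_sketch(slow_add, 2, 3))
print(total)  # 5
```

Note that `run_in_executor` forwards positional arguments only. A callable that needs keyword arguments can be wrapped with `functools.partial` first.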