MAX Engine Python API

You can run an inference with our Python API in just a few lines of code:

  1. Create an InferenceSession.
  2. Load a TensorFlow or PyTorch model with InferenceSession.load(), which returns a Model.
  3. Run the model by passing your input to Model.execute(), which returns the output.

That’s it! For more detail, see how to run inference with Python.
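
For example, here is a minimal end-to-end sketch. The model path and the input tensor name ('input_ids') are hypothetical; check input_metadata for your model's actual tensor names:

from pathlib import Path

import numpy as np
from max import engine

# 1. Create an InferenceSession.
session = engine.InferenceSession()

# 2. Load the model (hypothetical path to a BERT SavedModel).
model = session.load(Path('bert-base-uncased'))

# 3. Execute with one keyword argument per input tensor.
outputs = model.execute(input_ids=np.zeros((1, 16), dtype=np.int32))
print(outputs)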

InferenceSession

class max.engine.InferenceSession(num_threads: int | None = None, **kwargs)

Manages an inference session in which you can load and run models.

You need an instance of this to load a model as a Model object. For example:

from pathlib import Path

from max import engine

session = engine.InferenceSession()
model_path = Path('bert-base-uncased')
model = session.load(model_path)

  • Parameters:

    num_threads (Optional[int]) – Number of threads to use for the inference session. This parameter defaults to the number of physical cores on your machine.
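
For example, a minimal sketch that caps the session at four threads (the count here is arbitrary):

from max import engine

session = engine.InferenceSession(num_threads=4)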

load()

load(model_path: str | Path, *options: TensorFlowLoadOptions | TorchLoadOptions | CommonLoadOptions, **kwargs) → Model

Loads a trained model and compiles it for inference.

Note: PyTorch models must be in TorchScript format, and TensorFlow models must be in SavedModel format. You can also pass any ONNX model.

  • Parameters:

    • model_path (str | Path) – The path to the model you want to load.
    • options – Zero or more load options (TensorFlowLoadOptions, TorchLoadOptions, or CommonLoadOptions), depending on the model format.
  • Returns:

    The loaded model, compiled and ready to execute.

  • Return type:

    Model

  • Raises:

    RuntimeError – If the path provided is invalid.
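
For example, because ONNX models need no format-specific load options, loading one is a single call (the file path here is hypothetical):

from max import engine

session = engine.InferenceSession()
model = session.load('model.onnx')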

Model

class max.engine.Model

A loaded model that you can execute.

Do not instantiate this class directly. Instead, create it with InferenceSession.

execute()

execute(*args, **kwargs) → dict[str, ndarray | dict | list | tuple]

Executes the model with the provided input and returns the outputs.

For example, if the model has one input tensor named “input”:

import numpy as np

input_tensor = np.random.rand(1, 224, 224, 3)
model.execute(input=input_tensor)

  • Parameters:

    • args – Currently not supported. You must specify inputs using kwargs.
    • kwargs – The input tensors, each specified with the appropriate tensor name as a keyword and its value as an ndarray. You can find the tensor names to use as keywords from input_metadata.
  • Returns:

    A dictionary of output values, each as an ndarray, Dict, List, or Tuple identified by its output name.

  • Return type:

    Dict

  • Raises:

    • RuntimeError – If the given input tensors’ names and shapes don’t match what the model expects.
    • TypeError – If the given input tensors’ dtypes cannot be cast to what the model expects.
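
Because the keyword names for execute() come from input_metadata, you can also build the inputs dictionary dynamically. A minimal sketch, assuming every input shape is fully static and that zero-filled float32 arrays are acceptable placeholders (a real call should match each spec's dtype):

import numpy as np

inputs = {
    spec.name: np.zeros(spec.shape, dtype=np.float32)
    for spec in model.input_metadata
}
outputs = model.execute(**inputs)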

input_metadata

property input_metadata: list[TensorSpec]

Metadata about the model’s input tensors, as a list of TensorSpec objects.

For example, you can print the input tensor names, shapes, and dtypes:

for tensor in model.input_metadata:
    print(f'name: {tensor.name}, shape: {tensor.shape}, dtype: {tensor.dtype}')

output_metadata

property output_metadata: list[TensorSpec]

Metadata about the model’s output tensors, as a list of TensorSpec objects.

For example, you can print the output tensor names, shapes, and dtypes:

for tensor in model.output_metadata:
    print(f'name: {tensor.name}, shape: {tensor.shape}, dtype: {tensor.dtype}')

TensorSpec

class max.engine.TensorSpec(shape: list[int | None] | None, dtype: DType, name: str)

Defines the properties of a tensor, including its name, shape, and data type.

For usage examples, see Model.input_metadata.

dtype

property dtype: DType

The tensor’s data type.

name

property name: str

The tensor’s name.

shape

property shape: list[int] | None

The shape of the tensor as a list of integers.

If a dimension size is unknown/dynamic (such as the batch size), its value is None.
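
For example, a short sketch that reports which input dimensions of a loaded model are dynamic:

for spec in model.input_metadata:
    for i, dim in enumerate(spec.shape or []):
        if dim is None:
            print(f'{spec.name}: dimension {i} is dynamic')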

TensorFlowLoadOptions

class max.engine.TensorFlowLoadOptions(exported_name: str = 'serving_default', type: str = 'tf')

Configures how to load TensorFlow models.

exported_name

exported_name: str = 'serving_default'

The exported name from the TensorFlow model’s signature.
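
For example, to load a SavedModel under a signature other than the default (the signature name and model path here are hypothetical):

from max import engine

session = engine.InferenceSession()
tf_options = engine.TensorFlowLoadOptions(exported_name='my_signature')
model = session.load('my_savedmodel', tf_options)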

type

type: str = 'tf'

TorchLoadOptions

class max.engine.TorchLoadOptions(input_specs: list[TorchInputSpec] = <factory>, type: str = 'torch')

Configures how to load TorchScript models.

input_specs

input_specs: list[TorchInputSpec]

The tensor specifications (shape and data type) for each of the model inputs. This is required when loading serialized TorchScript models because they do not include type and shape annotations.

If the model supports an input with dynamic shapes, use None as the dimension size in shape.

For example:

from max import engine

session = engine.InferenceSession()
torch_options = engine.TorchLoadOptions()
torch_options.input_specs = [
    engine.TorchInputSpec(
        shape=[1, 16], dtype=engine.DType.int32
    ),
    engine.TorchInputSpec(
        shape=[1, 3, 224, 224], dtype=engine.DType.float32
    ),
    engine.TorchInputSpec(
        shape=[1, 16], dtype=engine.DType.int32
    ),
]
model = session.load("clip-vit.torchscript", torch_options)

type

type: str = 'torch'

TorchInputSpec

class max.engine.TorchInputSpec(shape: list[int | None] | None, dtype: DType)

Specifies the shape and data type of one input tensor for a TorchScript model.

Before you load a TorchScript model, you must create an instance of this class for each input tensor and pass them to TorchLoadOptions.

For example code, see TorchLoadOptions.

dtype

property dtype: DType

The data type of the torch input tensor.

shape

property shape: list[int] | None

The shape of the torch input tensor as a list of integers.

If a dimension size is unknown/dynamic (such as the batch size), use None for that dimension’s value.
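
For example, a spec with a dynamic batch size (the fixed image dimensions are illustrative):

from max import engine

spec = engine.TorchInputSpec(
    shape=[None, 3, 224, 224], dtype=engine.DType.float32
)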

CommonLoadOptions

class max.engine.CommonLoadOptions(custom_ops_path: str = '')

Common options for how to load models.

custom_ops_path

custom_ops_path: str = ''

The path to your custom ops. (This feature is coming soon.)

DType

class max.engine.DType(value)

The tensor data type.

bool

bool = 0

int8

int8 = 1

int16

int16 = 2

int32

int32 = 3

int64

int64 = 4

uint8

uint8 = 5

uint16

uint16 = 6

uint32

uint32 = 7

uint64

uint64 = 8

float16

float16 = 9

float32

float32 = 10

float64

float64 = 11

unknown

unknown = 12
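
Every member name except unknown is also a valid NumPy dtype string, so if you need to allocate NumPy inputs that match a TensorSpec, a small hypothetical helper (not part of this API; a sketch only) can do the mapping:

import numpy as np

def to_numpy_dtype(dtype):
    # DType member names (bool, int8, ..., float64) double as
    # NumPy dtype strings; 'unknown' has no NumPy equivalent.
    if dtype.name == 'unknown':
        raise ValueError('DType.unknown has no NumPy equivalent')
    return np.dtype(dtype.name)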