MAX Engine Python API
You can run an inference with our Python API in just a few lines of code:
- Create an InferenceSession.
- Load a model with InferenceSession.load(), which returns a Model.
- Run the model by passing your input to Model.execute(), which returns the output.
That’s it! For more detail, see how to run inference with Python.
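The three steps above can be sketched as a single helper. This is a minimal sketch rather than an official example: `run_inference` is a hypothetical name, and it expects the caller to create the session (step 1), so it can be read without any particular import path in mind.

```python
from pathlib import Path


def run_inference(session, model_path, **inputs):
    """Load a model with `session` and execute it once.

    `session` is expected to be an InferenceSession created by the
    caller (step 1). `run_inference` itself is a hypothetical helper.
    """
    # Step 2: load() compiles the model and returns a Model.
    model = session.load(Path(model_path))
    # Step 3: execute() takes inputs as keyword arguments named after
    # the model's input tensors and returns a dict of outputs.
    return model.execute(**inputs)
```

Keeping the session outside the helper lets one session load several models, which matches the description of InferenceSession as the object that manages loading and running.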
InferenceSession
class max.engine.InferenceSession(num_threads: int | None = None, **kwargs)
Manages an inference session in which you can load and run models.
You need an instance of this to load a model as a Model object.
For example:
from pathlib import Path
from max import engine
session = engine.InferenceSession()
model_path = Path('bert-base-uncased')
model = session.load(model_path)
Parameters:
num_threads (Optional[int]) – Number of threads to use for the inference session. Defaults to the number of physical cores on your machine.
load()
load(model_path: str | Path, *, custom_ops_path: str | None = None, input_specs: list[TorchInputSpec] | None = None) → Model
Loads a trained model and compiles it for inference.
Note: PyTorch models must be in TorchScript format.
Parameters:
- model_path (Union[str, pathlib.Path]) – Path to a model. May be a TorchScript model or an ONNX model.
- custom_ops_path (Optional[str]) – The path to your custom ops Mojo package.
- input_specs (Optional[list[TorchInputSpec]]) – The tensor specifications (shape and data type) for each of the model inputs. This is required when loading serialized TorchScript models because they do not include type and shape annotations.
If the model supports an input with dynamic shapes, use None as the dimension size in shape. For example:
from max import engine
session = engine.InferenceSession()
model = session.load(
    "clip-vit.torchscript",
    input_specs=[
        engine.TorchInputSpec(
            shape=[1, 16], dtype=engine.DType.int32
        ),
        engine.TorchInputSpec(
            shape=[1, 3, 224, 224], dtype=engine.DType.float32
        ),
        engine.TorchInputSpec(
            shape=[1, 16], dtype=engine.DType.int32
        ),
    ],
)
Returns:
The loaded model, compiled and ready to execute.
Return type:
Model
Raises:
RuntimeError – If the path provided is invalid.
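Because load() raises RuntimeError for an invalid path, a caller may want to fail early with a more specific message. The following is a minimal sketch under that assumption. `load_model` is a hypothetical helper, not part of the API, and the `Path.exists()` pre-check is an added convenience rather than documented behavior.

```python
from pathlib import Path


def load_model(session, model_path, **load_kwargs):
    """Load `model_path` with an InferenceSession, with clearer errors.

    Hypothetical helper: only `session.load()` and its RuntimeError
    come from the API described above.
    """
    path = Path(model_path)
    # Fail early with a specific error before handing the path to load().
    if not path.exists():
        raise FileNotFoundError(f"Model not found: {path}")
    try:
        # Extra keyword arguments (e.g. input_specs) pass straight through.
        return session.load(path, **load_kwargs)
    except RuntimeError as err:
        # Re-raise with the path attached so the failure is easy to trace.
        raise RuntimeError(f"Could not load model at {path}: {err}") from err
```

Separating "file is missing" from "file exists but could not be loaded" makes the two failure modes distinguishable in logs, even though load() itself reports both as RuntimeError in this sketch's assumptions.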
Model
class max.engine.Model
A loaded model that you can execute.
Do not instantiate this class directly. Instead, create it with InferenceSession.
execute()
execute(*args, **kwargs) → dict[str, ndarray | dict | list | tuple]
Executes the model with the provided input and returns the outputs.
For example, if the model has one input tensor named “input”:
import numpy as np
input_tensor = np.random.rand(1, 224, 224, 3)
model.execute(input=input_tensor)
Parameters:
- args – Currently not supported. You must specify inputs using kwargs.
- kwargs – The input tensors, each specified with the appropriate tensor name as a keyword and its value as an ndarray. You can find the tensor names to use as keywords from input_metadata.
Returns:
A dictionary of output values, each as an ndarray, Dict, List, or Tuple, identified by its output name.
Return type:
Dict
Raises:
Raises:
- RuntimeError – If the given input tensors’ names and shapes don’t match what the model expects.
- TypeError – If the given input tensors’ dtype cannot be cast to what the model expects.
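Since execute() accepts inputs only as keywords and the valid keywords come from input_metadata, the names can be checked before the call. This is a minimal sketch of that idea: `execute_checked` is a hypothetical helper, and raising ValueError for a name mismatch is this sketch's own choice, not documented behavior.

```python
def execute_checked(model, **inputs):
    """Check keyword names against input_metadata, then execute.

    Hypothetical helper: only `model.input_metadata`, `TensorSpec.name`,
    and `model.execute()` come from the API described on this page.
    """
    expected = {spec.name for spec in model.input_metadata}
    given = set(inputs)
    if given != expected:
        missing = sorted(expected - given)
        unexpected = sorted(given - expected)
        raise ValueError(
            f"Input names do not match the model: "
            f"missing={missing}, unexpected={unexpected}"
        )
    # Names match; shape and dtype checks are left to execute() itself.
    return model.execute(**inputs)
```

For a model with one input named "input", `execute_checked(model, input=input_tensor)` behaves like `model.execute(input=input_tensor)`, while a misspelled keyword fails with a message listing the expected name.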
input_metadata
property input_metadata: list[TensorSpec]
Metadata about the model’s input tensors, as a list of TensorSpec objects.
For example, you can print the input tensor names, shapes, and dtypes:
for tensor in model.input_metadata:
    print(f'name: {tensor.name}, shape: {tensor.shape}, dtype: {tensor.dtype}')
output_metadata
property output_metadata: list[TensorSpec]
Metadata about the model’s output tensors, as a list of TensorSpec objects.
For example, you can print the output tensor names, shapes, and dtypes:
for tensor in model.output_metadata:
    print(f'name: {tensor.name}, shape: {tensor.shape}, dtype: {tensor.dtype}')
TensorSpec
class max.engine.TensorSpec(shape: list[int | None] | None, dtype: DType, name: str)
Defines the properties of a tensor, including its name, shape and data type.
For usage examples, see Model.input_metadata.
dtype
property dtype: DType
A tensor data type.
name
property name: str
A tensor name.
shape
The shape of the tensor as a list of integers.
If a dimension size is unknown/dynamic (such as the batch size), its value is None.
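A shape with None entries cannot be used directly to allocate a test input, so the dynamic dimensions need concrete sizes first. The following is a minimal sketch of that step. `concrete_shape` and its `dynamic_size` parameter are hypothetical names, and the function works on a plain list such as the value of TensorSpec.shape.

```python
def concrete_shape(shape, dynamic_size=1):
    """Replace each dynamic (None) dimension with `dynamic_size`."""
    if shape is None:
        # The TensorSpec signature allows shape itself to be None;
        # this sketch passes that through unchanged.
        return None
    return [dynamic_size if dim is None else dim for dim in shape]
```

For example, `concrete_shape([None, 3, 224, 224])` gives `[1, 3, 224, 224]`, which can then be passed to an array constructor to build an input for Model.execute().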
TorchInputSpec
class max.engine.TorchInputSpec(shape: list[int | str] | None, dtype: DType)
Specifies the input specification (shape and data type) for one input tensor of a TorchScript model.
Before you load a TorchScript model, you must create an instance of this class for each input tensor, and pass them to the input_specs argument of InferenceSession.load().
For example code, see InferenceSession.load().
dtype
property dtype: DType
A torch input tensor data type.
shape
The shape of the torch input tensor as a list of integers.
If a dimension size is unknown/dynamic (such as the batch size), the shape should be None.
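When a model has several inputs, the list of specs can be built from plain data. This is a minimal sketch, not part of the API: `make_input_specs` is a hypothetical helper that takes the imported engine module as an argument, and it follows the InferenceSession.load() description in using None for a dynamic dimension (the class signature above lists `int | str`, so treat that detail as an assumption).

```python
def make_input_specs(engine, inputs):
    """Build one TorchInputSpec per (shape, dtype_name) pair.

    `engine` is the imported max.engine module, passed in by the caller.
    `inputs` is a list of (shape, dtype_name) tuples, where dtype_name
    is a DType member name such as "int32" or "float32".
    """
    return [
        # Look up the DType member by name, e.g. engine.DType.int32.
        engine.TorchInputSpec(shape=shape, dtype=getattr(engine.DType, name))
        for shape, name in inputs
    ]
```

A call such as `make_input_specs(engine, [([None, 16], "int32"), ([None, 3, 224, 224], "float32")])` would produce a list suitable for the input_specs argument of InferenceSession.load().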
DType
class max.engine.DType(value)
The tensor data type.
bool
bool = 0
int8
int8 = 1
int16
int16 = 2
int32
int32 = 3
int64
int64 = 4
uint8
uint8 = 5
uint16
uint16 = 6
uint32
uint32 = 7
uint64
uint64 = 8
float16
float16 = 9
float32
float32 = 10
float64
float64 = 11
unknown
unknown = 12
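The values above can be used to look a type up by number or by name. The following is a minimal sketch using a local stand-in enum, so it runs without the library: `DTypeMirror`, `ITEMSIZE`, and `tensor_nbytes` are hypothetical names, the member values are copied from the list above, and the byte widths are the conventional ones for these type names (bool stored as one byte, as in NumPy), not something this page states.

```python
from enum import IntEnum
from math import prod

# Local stand-in mirroring the DType values listed above (bool = 0 ... unknown = 12).
DTypeMirror = IntEnum(
    "DTypeMirror",
    "bool int8 int16 int32 int64 uint8 uint16 uint32 uint64 "
    "float16 float32 float64 unknown",
    start=0,
)

# Assumed bytes per element for each type name; "unknown" has no size.
ITEMSIZE = {
    "bool": 1, "int8": 1, "uint8": 1,
    "int16": 2, "uint16": 2, "float16": 2,
    "int32": 4, "uint32": 4, "float32": 4,
    "int64": 8, "uint64": 8, "float64": 8,
}


def tensor_nbytes(shape, dtype_name):
    """Return the size in bytes of a dense tensor with a fully known shape."""
    return prod(shape) * ITEMSIZE[dtype_name]
```

For example, `DTypeMirror(10).name` is `"float32"`, and `tensor_nbytes([1, 3, 224, 224], "float32")` is `602112`.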