Inference Engine Python API
This is a preview of the Modular Inference Engine. It is not publicly available yet and APIs are subject to change.
If you’re interested, please sign up for early access.
Executing a model with the Modular Inference Engine is easy:
1. Create an InferenceSession and load a TensorFlow or PyTorch model with InferenceSession.load().
2. Run the model by calling Model.execute() and passing your input data. This function returns the model output.
That's it.
To see some code in action, see the Python get started guide.
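Putting those steps together, here is a minimal end-to-end sketch. It assumes the early-access package imports as modular.engine, as named in this reference; the model path and the input shape/dtype are placeholders you would replace for your own model, and the import is guarded because the package is in closed preview.

```python
import numpy as np

# Example input data; the shape and dtype are placeholders and must match
# what your own model expects (here: a typical image-classification input).
input_tensor = np.zeros((1, 224, 224, 3), dtype=np.float32)

try:
    from modular import engine

    # 1. Create an InferenceSession; num_threads defaults to the number
    #    of physical cores on your machine.
    session = engine.InferenceSession()

    # 2. Load and compile a TensorFlow SavedModel or traceable PyTorch
    #    model. "path/to/saved_model" is a placeholder path.
    model = session.load("path/to/saved_model")

    # 3. Execute the model with ndarray inputs to get the model outputs.
    outputs = model.execute(input_tensor)
except ImportError:
    # The Modular Engine is in closed preview and may not be installed.
    pass
```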
- class modular.engine.InferenceSession(num_threads: Optional[int] = None)
Manages an inference session in which you can load and run models.
- Parameters:
num_threads (Optional[int]) – Number of threads to use for the inference session. This parameter defaults to the number of physical cores on your machine.
- load(model_path: Union[str, Path]) → Model
Loads a trained model and compiles it for inference.
- Parameters:
model_path (Union[str, pathlib.Path]) – Path to a model. May be a TensorFlow model in the SavedModel format or a traceable PyTorch model.
- Returns:
The loaded model, compiled and ready to execute.
- Return type:
Model
- Raises:
RuntimeError – If the path provided is invalid.
- class modular.engine.Model
A loaded model that you can execute.
You should not instantiate this class directly. Instead, create a Model by passing your model file to InferenceSession.load(). Then you can run the model by passing your input data to Model.execute().
- execute(*args)
Executes the model with the provided input and returns outputs.
- Parameters:
*args – Input tensors as ndarray data.
- Returns:
Output tensors.
- Return type:
- Raises:
RuntimeError – If the input tensors don’t match what the model expects.
- property input_metadata: List[TensorSpec]
Metadata about the input tensors that the model accepts.
You can use this to query the tensor shapes and data types like this:
for tensor in model.input_metadata:
    print(f'shape: {tensor.shape}, dtype: {tensor.dtype}')
- class modular.engine.TensorSpec
Defines the properties of a tensor, namely its shape and data type.
You can get a list of TensorSpec objects that specify the input tensors of a Model from the Model.input_metadata property.
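Because Model.execute() raises a RuntimeError when inputs don't match the model's expectations, you may want to validate inputs against the metadata first. The helper below is a hypothetical illustration: matches_spec and its spec_shape/spec_dtype parameters stand in for TensorSpec.shape and TensorSpec.dtype, and treating None as a dynamic dimension is an assumption, not part of the documented API.

```python
import numpy as np

def matches_spec(array, spec_shape, spec_dtype):
    """Return True if `array` matches the given shape and dtype.

    A None entry in spec_shape stands in for a dynamic dimension
    (an assumption for this sketch, not documented API behavior).
    """
    if array.dtype != spec_dtype:
        return False
    if len(array.shape) != len(spec_shape):
        return False
    return all(s is None or s == a for s, a in zip(spec_shape, array.shape))

# A batch of 8 placeholder images.
batch = np.zeros((8, 224, 224, 3), dtype=np.float32)

print(matches_spec(batch, (None, 224, 224, 3), np.float32))  # True
print(matches_spec(batch, (1, 224, 224, 3), np.float32))     # False: fixed batch of 1
```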