Model
This is a preview of the Modular Inference Engine. It is not publicly available yet and APIs are subject to change.
If you’re interested, please sign up for early access.
#include "modular/c/model.h"
Functions
-
M_CompileConfig *M_newCompileConfig()
Creates an object you can use to configure model compilation.
- Returns:
A pointer to a new compilation configuration. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeCompileConfig().
-
void M_setModelPath(M_CompileConfig *compileConfig, const char *path)
Sets the path to a model.
- Parameters:
compileConfig – The compilation configuration for your model.
path – The path to your model. The model does not need to exist on the filesystem at this point. This follows the same semantics and expectations as std::filesystem::path.
-
M_AsyncCompiledModel *M_compileModel(const M_RuntimeContext *context, const M_CompileConfig *compileConfig, M_Status *status)
Compiles a model.
This function returns immediately with compilation happening asynchronously.
You should call M_setModelPath() before you call this.
- Parameters:
context – The runtime context.
compileConfig – The compilation configuration for your model.
status – The status used to report errors in the case of failures during model compilation.
- Returns:
A pointer to an M_AsyncCompiledModel. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeCompiledModel(). If the config is invalid, it returns a NULL pointer. If the model compilation fails, the pointer is NULL and the status parameter contains an error message.
-
M_AsyncCompiledModel *M_compileModelSync(const M_RuntimeContext *context, const M_CompileConfig *compileConfig, M_Status *status)
Compiles a model.
This operation is blocking and waits until compilation completes.
You should call M_setModelPath() before you call this.
- Parameters:
context – The runtime context.
compileConfig – The compilation configuration for your model.
status – The status used to report errors in the case of failures during model compilation.
- Returns:
A pointer to an M_AsyncCompiledModel that is in a resolved state. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeCompiledModel(). If the config is invalid, it returns a NULL pointer. If the model compilation fails, the pointer is NULL and the status parameter contains an error message.
-
void M_waitForCompilation(M_AsyncCompiledModel *compiledModel, M_Status *status)
Blocks execution until the model is compiled.
The function waits for the asynchronous compilation to complete. When the function returns, the model is resolved to either a compiled model or an error.
- Parameters:
compiledModel – The model received from M_compileModel().
status – The status used to report errors in the case of failures.
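The asynchronous compile-then-wait flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's reference usage. The helper name compile_once is hypothetical, and the block defines small stand-in stubs with the documented signatures (their struct layouts and error behavior are invented) so that it builds without the preview SDK. Creating the runtime context and inspecting M_Status belong to other headers and are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Stand-in stubs, NOT the real engine. ----
 * They copy the signatures documented above so the sketch builds without
 * the preview SDK. With the SDK installed, delete this section and use
 * #include "modular/c/model.h" instead. The struct layouts are invented. */
typedef struct M_RuntimeContext { int unused; } M_RuntimeContext;
typedef struct M_Status { char message[64]; } M_Status;
typedef struct M_CompileConfig { char path[256]; } M_CompileConfig;
typedef struct M_AsyncCompiledModel { int resolved; } M_AsyncCompiledModel;

M_CompileConfig *M_newCompileConfig(void) {
  return calloc(1, sizeof(M_CompileConfig));
}
void M_setModelPath(M_CompileConfig *config, const char *path) {
  snprintf(config->path, sizeof config->path, "%s", path);
}
M_AsyncCompiledModel *M_compileModel(const M_RuntimeContext *context,
                                     const M_CompileConfig *config,
                                     M_Status *status) {
  (void)context;
  if (config->path[0] == '\0') { /* the stub treats an empty path as invalid */
    snprintf(status->message, sizeof status->message, "no model path set");
    return NULL;
  }
  return calloc(1, sizeof(M_AsyncCompiledModel));
}
void M_waitForCompilation(M_AsyncCompiledModel *model, M_Status *status) {
  (void)status;
  model->resolved = 1; /* the stub resolves instantly */
}
void M_freeCompiledModel(M_AsyncCompiledModel *model) { free(model); }
void M_freeCompileConfig(M_CompileConfig *config) { free(config); }
/* ---- End of stubs. ---- */

/* Hypothetical helper: compiles the model at `path`, blocks until the
 * compilation resolves, then releases everything it allocated.
 * Returns 0 if M_compileModel() handed back a model, -1 otherwise. */
int compile_once(const M_RuntimeContext *context, const char *path,
                 M_Status *status) {
  assert(context != NULL && path != NULL && status != NULL);

  M_CompileConfig *config = M_newCompileConfig();
  if (config == NULL) return -1;
  M_setModelPath(config, path); /* must happen before compiling */

  int rc = -1;
  M_AsyncCompiledModel *model = M_compileModel(context, config, status);
  if (model != NULL) {
    /* Compilation is in flight; unrelated work could happen here. */
    M_waitForCompilation(model, status);
    /* A real caller would inspect `status` at this point. */
    rc = 0;
    M_freeCompiledModel(model);
  }
  M_freeCompileConfig(config); /* freed last, after the compiled model */
  return rc;
}
```

A caller would invoke compile_once(context, "path/to/model", status) with a context and status obtained from the rest of the C API. The sketch frees the config only after the compiled model is released because this section does not say whether the config may be freed earlier.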
-
M_AsyncModel *M_initModelSync(const M_RuntimeContext *context, const M_AsyncCompiledModel *compiledModel, M_Status *status)
Sets up a model for execution.
This operation is blocking and waits until model initialization completes.
You should call M_compileModel() before calling this.
- Parameters:
context – The runtime context.
compiledModel – The compiled model.
status – The status used to report errors in the case of failures. The status contains an error if the given context or compiled model is invalid or if initialization failed.
- Returns:
A pointer to an M_AsyncModel that holds an async value to an initialized model. This async value is in a resolved state. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeModel(). If model initialization fails, the status parameter contains an error message.
-
M_AsyncModel *M_initModel(const M_RuntimeContext *context, const M_AsyncCompiledModel *compiledModel, M_Status *status)
Sets up a model for execution.
This function returns immediately with model initialization happening asynchronously.
You should call M_compileModel() before calling this.
- Parameters:
context – The runtime context.
compiledModel – The compiled model.
status – The status used to report errors in the case of failures. The status contains an error only if the given context or compiled model is invalid. Other errors will not surface until the next synchronization point.
- Returns:
A pointer to an M_AsyncModel that holds an async value to a compiled model. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeModel(). If model initialization fails, the status parameter contains an error message.
-
M_TensorSpec *M_getModelInputSpecAt(const M_AsyncCompiledModel *model, size_t index, M_Status *status)
Gets the specifications for an input tensor.
- Parameters:
model – The compiled model.
index – The index of the input tensor.
status – The status used to report errors in the case of failures. The status contains an error only if the given model or index is invalid.
- Returns:
A pointer to an M_TensorSpec, or a NULL pointer if the model or index is invalid. If NULL, the status parameter contains an error message.
-
void M_waitForModel(M_AsyncModel *model, M_Status *status)
Blocks execution until the model is set up.
- Parameters:
model – The model.
status – The status used to report errors in the case of failures.
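Asynchronous initialization followed by a synchronization point might look like the sketch below. The helper name init_and_wait is hypothetical, and the stubs only imitate the documented signatures so the block compiles without the preview SDK. Treating a NULL return from M_initModel() as failure is an assumption, since this section documents the status behavior but not the return value on error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Stand-in stubs, NOT the real engine. ----
 * Same signatures as documented above; struct layouts and the NULL return
 * on invalid arguments are invented. With the SDK installed, delete this
 * section and use #include "modular/c/model.h" instead. */
typedef struct M_RuntimeContext { int unused; } M_RuntimeContext;
typedef struct M_Status { char message[64]; } M_Status;
typedef struct M_AsyncCompiledModel { int resolved; } M_AsyncCompiledModel;
typedef struct M_AsyncModel { int ready; } M_AsyncModel;

M_AsyncModel *M_initModel(const M_RuntimeContext *context,
                          const M_AsyncCompiledModel *compiledModel,
                          M_Status *status) {
  if (context == NULL || compiledModel == NULL) {
    snprintf(status->message, sizeof status->message,
             "invalid context or compiled model");
    return NULL;
  }
  return calloc(1, sizeof(M_AsyncModel)); /* not ready yet */
}
void M_waitForModel(M_AsyncModel *model, M_Status *status) {
  (void)status;
  model->ready = 1; /* the stub finishes instantly */
}
void M_freeModel(M_AsyncModel *model) { free(model); }
/* ---- End of stubs. ---- */

/* Hypothetical helper: starts initialization, then blocks until the model
 * is set up. The caller owns the result and must call M_freeModel(). */
M_AsyncModel *init_and_wait(const M_RuntimeContext *context,
                            const M_AsyncCompiledModel *compiledModel,
                            M_Status *status) {
  assert(status != NULL);

  M_AsyncModel *model = M_initModel(context, compiledModel, status);
  if (model == NULL) return NULL; /* assumption: NULL signals failure */

  /* Errors other than invalid arguments surface only at this point. */
  M_waitForModel(model, status);
  return model;
}
```

When nothing useful can be done between the two calls, M_initModelSync() covers the same ground in a single call.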
-
M_AsyncTensorArray *M_executeModelSync(const M_RuntimeContext *context, M_AsyncModel *initializedModel, M_AsyncTensorArray *inputs, M_Status *status)
Executes a model.
The inputs and outputs are M_AsyncTensorArray objects to allow chaining of inference. This operation is blocking and waits until the output results are ready.
- Parameters:
context – The runtime context.
initializedModel – The model to execute.
inputs – The tensor inputs.
status – The status used to report errors in the case of failures. The status will contain an error only if the context or model is invalid. Other errors will not surface until the next synchronization point.
- Returns:
A pointer to an M_AsyncTensorArray that holds the output tensors. These tensors are in a resolved state. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeAsyncTensorArray(). In the case that executing the model fails, the status parameter contains an error message.
-
M_AsyncTensorArray *M_executeModel(const M_RuntimeContext *context, M_AsyncModel *initializedModel, M_AsyncTensorArray *inputs, M_Status *status)
Executes a model asynchronously with the given inputs.
The inputs and outputs are M_AsyncTensorArray objects to allow chaining of inference. This operation is non-blocking (that is, asynchronous) and does not wait for the output results to be ready.
- Parameters:
context – The runtime context.
initializedModel – The model to execute.
inputs – The tensor inputs.
status – The status used to report errors in the case of failures. The status will contain an error only if the context or model is invalid. Other errors will not surface until the next synchronization point.
- Returns:
A pointer to an M_AsyncTensorArray that holds the output tensors. You are responsible for the memory associated with the pointer returned. You can deallocate the memory by calling M_freeAsyncTensorArray().
-
void M_waitForTensors(M_AsyncTensorArray *outputs, M_Status *status)
Waits for the outputs to be ready.
- Parameters:
outputs – The output tensors from executing a model.
status – The status used to report errors in the case of failures.
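A non-blocking execution followed by M_waitForTensors() could be wrapped as in the sketch below. The helper name execute_and_wait is hypothetical, and the stubs are placeholders that only mirror the documented signatures. The signature of M_freeAsyncTensorArray() is assumed from its name, and the NULL check on the result of M_executeModel() is defensive rather than documented behavior. Building the input tensor array is outside the scope of this header and is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Stand-in stubs, NOT the real engine. ----
 * Same signatures as documented above; struct layouts are invented.
 * With the SDK installed, delete this section and use
 * #include "modular/c/model.h" instead. */
typedef struct M_RuntimeContext { int unused; } M_RuntimeContext;
typedef struct M_Status { char message[64]; } M_Status;
typedef struct M_AsyncModel { int ready; } M_AsyncModel;
typedef struct M_AsyncTensorArray { int resolved; } M_AsyncTensorArray;

M_AsyncTensorArray *M_executeModel(const M_RuntimeContext *context,
                                   M_AsyncModel *initializedModel,
                                   M_AsyncTensorArray *inputs,
                                   M_Status *status) {
  (void)inputs;
  if (context == NULL || initializedModel == NULL) {
    snprintf(status->message, sizeof status->message,
             "invalid context or model");
    return NULL;
  }
  return calloc(1, sizeof(M_AsyncTensorArray)); /* outputs not resolved yet */
}
void M_waitForTensors(M_AsyncTensorArray *outputs, M_Status *status) {
  (void)status;
  outputs->resolved = 1; /* the stub resolves instantly */
}
/* Assumed signature: the section names this function but does not list it. */
void M_freeAsyncTensorArray(M_AsyncTensorArray *array) { free(array); }
/* ---- End of stubs. ---- */

/* Hypothetical helper: launches inference, then blocks until the outputs
 * are ready. The caller owns the result and must release it with
 * M_freeAsyncTensorArray(). */
M_AsyncTensorArray *execute_and_wait(const M_RuntimeContext *context,
                                     M_AsyncModel *model,
                                     M_AsyncTensorArray *inputs,
                                     M_Status *status) {
  assert(inputs != NULL && status != NULL);

  M_AsyncTensorArray *outputs = M_executeModel(context, model, inputs, status);
  if (outputs == NULL) return NULL; /* defensive; not documented behavior */

  /* Synchronization point: execution errors surface in `status` here. */
  M_waitForTensors(outputs, status);
  return outputs;
}
```

Splitting the two calls only pays off when there is other work to overlap with inference; otherwise M_executeModelSync() is the shorter route.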
-
size_t M_getNumModelInputs(const M_AsyncCompiledModel *model, M_Status *status)
Gets the number of inputs for the model.
If the model is not yet resolved/ready, this function blocks execution.
You should call M_compileModel() before calling this.
- Parameters:
model – The compiled model.
status – The status used to report errors in the case of failures.
- Returns:
The number of inputs for the model, or 0 if there is an error in getting the model metadata. If 0, the status parameter contains an error message.
-
size_t M_getNumModelOutputs(const M_AsyncCompiledModel *model, M_Status *status)
Gets the number of outputs for the model.
If the model is not yet resolved/ready, this function blocks execution.
You should call M_compileModel() before calling this.
- Parameters:
model – The compiled model.
status – The status used to report errors in the case of failures.
- Returns:
The number of outputs for the model, or 0 if there is an error in getting the model metadata. If 0, the status parameter contains an error message.
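The metadata queries can be combined to walk over a compiled model's inputs, roughly as sketched here. The helper name count_input_specs is hypothetical and the stubs only imitate the documented signatures. M_freeTensorSpec() is an assumption: this section does not say who owns the M_TensorSpec returned by M_getModelInputSpecAt() or how to release it, so check the tensor documentation before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Stand-in stubs, NOT the real engine. ----
 * Same signatures as documented above; struct layouts are invented.
 * With the SDK installed, delete this section and use
 * #include "modular/c/model.h" instead. */
typedef struct M_Status { char message[64]; } M_Status;
typedef struct M_AsyncCompiledModel {
  size_t numInputs;
  size_t numOutputs;
} M_AsyncCompiledModel;
typedef struct M_TensorSpec { size_t index; } M_TensorSpec;

size_t M_getNumModelInputs(const M_AsyncCompiledModel *model, M_Status *status) {
  if (model == NULL) {
    snprintf(status->message, sizeof status->message, "invalid model");
    return 0;
  }
  return model->numInputs;
}
size_t M_getNumModelOutputs(const M_AsyncCompiledModel *model, M_Status *status) {
  if (model == NULL) {
    snprintf(status->message, sizeof status->message, "invalid model");
    return 0;
  }
  return model->numOutputs;
}
M_TensorSpec *M_getModelInputSpecAt(const M_AsyncCompiledModel *model,
                                    size_t index, M_Status *status) {
  if (model == NULL || index >= model->numInputs) {
    snprintf(status->message, sizeof status->message, "invalid model or index");
    return NULL;
  }
  M_TensorSpec *spec = calloc(1, sizeof *spec);
  if (spec != NULL) spec->index = index;
  return spec;
}
/* ASSUMPTION: a release function for specs; not listed in this section. */
void M_freeTensorSpec(M_TensorSpec *spec) { free(spec); }
/* ---- End of stubs. ---- */

/* Hypothetical helper: stores the output count in *numOutputs and returns
 * how many input specs could be fetched, or -1 if either count is 0
 * (documented above as the error value). */
long count_input_specs(const M_AsyncCompiledModel *model, M_Status *status,
                       size_t *numOutputs) {
  assert(status != NULL && numOutputs != NULL);

  size_t numInputs = M_getNumModelInputs(model, status);
  *numOutputs = M_getNumModelOutputs(model, status);
  if (numInputs == 0 || *numOutputs == 0) return -1;

  long fetched = 0;
  for (size_t i = 0; i < numInputs; ++i) {
    M_TensorSpec *spec = M_getModelInputSpecAt(model, i, status);
    if (spec == NULL) break; /* `status` holds the error message */
    /* A real caller would read the spec's properties here. */
    ++fetched;
    M_freeTensorSpec(spec);
  }
  return fetched;
}
```

Because both count functions block until the model is resolved, this helper can be called directly on the result of M_compileModel() without an explicit M_waitForCompilation().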
-
void M_freeModel(M_AsyncModel *model)
Deallocates the memory for the model.
- Parameters:
model – The model to deallocate.
-
void M_freeCompiledModel(M_AsyncCompiledModel *model)
Deallocates the memory for the compiled model.
- Parameters:
model – The compiled model to deallocate.
-
void M_freeCompileConfig(M_CompileConfig *config)
Deallocates the memory for the compile config.
- Parameters:
config – The compilation configuration to deallocate.