# Input data schema
The following YAML schema allows you to specify the input shapes required by the models you pass to the `benchmark` and `visualize` commands, using the `--input-data-schema` option.

Specifying the input shapes is necessary when your model does not include shape metadata, which is required to generate inputs and compile the model. When benchmarking your model, you can also use the schema to specify your own input data or override default behaviors, such as the randomization strategy for input data.

The purpose of the input data schema is to specify input shapes and the input data used for benchmark scenarios; it is not a complete specification of the model's supported inputs.
## inputs

The top-level property that includes a list of all inputs to generate. Other than the identifier for each input (`input_name`), nothing else is required if the model provides metadata about the input shape for each input. For example, if your model identifies each input tensor by name, and the model includes metadata specifying the input shape and data type for each input tensor, then you can simply declare each input name like this:
```yaml
inputs:
- input_name: image
- input_name: text
```
As a result, the benchmark tool generates random input data for the "image" and "text" inputs, and no others. The benchmark tool does not generate data for inputs that are unspecified in the YAML file. However, if you do not provide a YAML file at all, it generates data for all inputs that are specified in the model metadata (if available). Thus, if your model includes metadata for all inputs, this YAML file is not required, but it allows you to restrict or modify the data generated for your benchmark scenario.
If the model does not include metadata about the input shapes, then each listed input must also specify the `shape`.
## input_name

Specifies an input based on name.

Example:

```yaml
inputs:
- input_name: image
```
## shape

Specifies the shape and data type for an input. This must be combined with `input_name`.

Example:

```yaml
inputs:
- input_name: input
  shape: 1x10x50xf32
```
In this example, the shape is `1x10x50xf32`, which declares that the input is 3-dimensional: the first dimension (the batch size) is 1, the second dimension is 10, the third is 50, and the data type is `f32` (32-bit float).
If you want to generate input data for benchmarking, you must specify every dimension with a fixed size, including the batch size. You'll receive an error if any dimension is not fully specified, because the tool cannot generate inputs for non-concrete shapes. Your model may still support dynamic shapes on any dimension, but you must specify a static size to generate benchmarking inputs.
If you will provide your own input data with the `data.file` property, then it's okay to leave input shapes as dynamic (such as `?x?x50xf32`), because the benchmark tool does not need to generate any input data.
Currently, you cannot run a single benchmark that uses multiple shapes for consecutive inferences. If your model supports dynamic input shapes and you want to use a different shape for each inference, then you must use separate input data schemas, one for each scenario you want to benchmark.
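For example, to benchmark two sequence lengths as separate scenarios, you might write two schema files and pass each to a separate benchmark run (the input name `text` here is illustrative, not required by the tool):

```yaml
# schema-seq32.yaml
inputs:
- input_name: text
  shape: 1x32xf32
```

```yaml
# schema-seq128.yaml (a separate file, for a second benchmark run)
inputs:
- input_name: text
  shape: 1x128xf32
```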
## Data types

You can specify the shape's data type with the following notations:
- `bool`: boolean.
- `si8`: signed integer with bitwidth 8.
- `ui8`: unsigned integer with bitwidth 8.
- `si16`: signed integer with bitwidth 16.
- `ui16`: unsigned integer with bitwidth 16.
- `si32`: signed integer with bitwidth 32.
- `ui32`: unsigned integer with bitwidth 32.
- `si64`: signed integer with bitwidth 64.
- `ui64`: unsigned integer with bitwidth 64.
- `f16`: IEEE 754-2008 binary16 floating-point value.
- `f32`: IEEE 754-2008 binary32 floating-point value.
- `f64`: IEEE 754-2008 binary64 floating-point value.
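If you prepare input files with NumPy, these notations correspond to NumPy dtypes roughly as follows. This mapping is an assumption for illustration; the benchmark tool itself does not define or require it:

```python
import numpy as np

# Assumed correspondence between schema type notations and NumPy dtypes.
DTYPE_MAP = {
    "bool": np.bool_,
    "si8": np.int8, "ui8": np.uint8,
    "si16": np.int16, "ui16": np.uint16,
    "si32": np.int32, "ui32": np.uint32,
    "si64": np.int64, "ui64": np.uint64,
    "f16": np.float16, "f32": np.float32, "f64": np.float64,
}

# The integer default range described under `data` ("the absolute minimum/
# maximum for integer types") is the type's representable range, e.g. for si8:
info = np.iinfo(DTYPE_MAP["si8"])
print(info.min, info.max)  # -128 127
```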
## data

Specifies the type of data to use for the current input in your benchmark. The data may be `random` (you specify the randomization strategy), `constant` (you specify a single value), or `file` (you specify the inputs with a file). This element is always optional and the default is `random`. This must be combined with `input_name`.

The available data options are the following:
- `random`: All values are randomly generated (this is the default). Currently, we support only uniform distribution:
  - `uniform`: Uses a uniform distribution. You can specify the minimum and maximum value with nested elements:
    - `min`: The minimum value (default is `-1.0` for floats, or the absolute minimum for integer types).
    - `max`: The maximum value (default is `1.0` for floats, or the absolute maximum for integer types).
- `constant`: Uses the given value for all inputs.
- `file`: Uses inputs you provide in a file. (Currently NumPy files only.)
  - `path`: The path to your `.npy` file. The data must include a "samples" dimension in the first (zeroth) axis. That is, the file can include multiple input samples to use in your benchmark, but if you want to provide only one sample, then you must include the samples dimension anyway. For example, call `np.expand_dims(input_sample, axis=0)`.
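Here's a minimal sketch of preparing such a file with NumPy. The `(10, 50)` sample shape and the file name are illustrative:

```python
import numpy as np

# A single input sample; the (10, 50) shape is illustrative.
input_sample = np.random.rand(10, 50).astype(np.float32)

# Add the required "samples" dimension as the first (zeroth) axis:
# (10, 50) -> (1, 10, 50).
batched = np.expand_dims(input_sample, axis=0)
assert batched.shape == (1, 10, 50)

# To provide several samples instead, stack them along axis 0, e.g.
# np.stack([sample_a, sample_b]) yields shape (2, 10, 50).
np.save("numpy_data.npy", batched)
```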
Example:

```yaml
inputs:
- input_name: input1
  shape: 1x10x50xf32
  data:
    random:
      uniform:
        min: 0
- input_name: input2
  shape: 1x10x50xf32
  data:
    constant: 1.1
- input_name: input3
  data:
    file:
      path: path/to/my/numpy_data.npy
```
## compile_shape

We put this one last because it addresses a special situation that most people won't encounter.

Technically, `compile_shape` must always be set, but it defaults to the value given with `shape`. However, you might want to change `compile_shape` to something different when benchmarking a model that accepts a variable sequence length. In that case, you should define `compile_shape` to indicate which dimensions are variable (dynamic).
Adding `compile_shape` with a dynamic shape definition ensures that the MAX compiler does not incorrectly assume that inputs will always have static shapes, in which case it might apply optimizations to improve performance based on that static shape. Thus, `compile_shape` prevents the compiler from over-optimizing for a particular shape, resulting in more honest benchmark results.
For example, let's say your model accepts a batch of variable-length sequences, and you expect most queries to be 32 tokens long. So, you specify a static shape of `1x32xf32` to use in your benchmarks. To ensure that the MAX compiler does not optimize upon this fixed shape, you can then specify `compile_shape` as follows:

```yaml
inputs:
- input_name: 1
  shape: 1x32xf32
  compile_shape: ?x?xf32
```
Each question mark indicates a dynamic dimension. This way, the MAX compiler knows the first and second dimensions are truly unknown and it must generalize for different input shapes, whereas for benchmark inferencing purposes, we'll use data that's generated based on the static `shape` value.
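To make the notation concrete, here's a small sketch (not part of the tool) that decomposes a shape string like the ones above into its dimensions and data type, treating `?` as a dynamic dimension:

```python
def parse_shape(shape: str):
    """Split a shape string like '1x32xf32' or '?x?xf32' into
    (dims, dtype), where a dynamic dimension ('?') becomes None."""
    *dims, dtype = shape.split("x")
    parsed = [None if d == "?" else int(d) for d in dims]
    return parsed, dtype

print(parse_shape("1x32xf32"))  # ([1, 32], 'f32')
print(parse_shape("?x?xf32"))   # ([None, None], 'f32')
```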