Mojo module

device_query

Provides device query utilities for communication primitives.

comptime values

allreduce_table

comptime allreduce_table = Table(
    List(
        VariadicList(
            TuningConfigAllreduce(-1, -1, StringSlice("sm_90a"), 216),
            TuningConfigAllreduce(4, 134217728, StringSlice("sm_90a"), 232),
            TuningConfigAllreduce(-1, -1, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(2, 8388608, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(2, 16777216, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(2, 33554432, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(2, 67108864, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(2, 134217728, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(4, 8388608, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(4, 16777216, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(4, 33554432, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(4, 67108864, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(4, 134217728, StringSlice("sm_100a"), 512),
            TuningConfigAllreduce(-1, -1, StringSlice("CDNA3"), 32),
            TuningConfigAllreduce(-1, -1, StringSlice("CDNA4"), 64),
            TuningConfigAllreduce(8, 1048576, StringSlice("CDNA4"), 64),
            TuningConfigAllreduce(8, 2147483648, StringSlice("CDNA4"), 44)
        ),
        Tuple()
    ),
    String("allreduce_table")
)

Structs

  • TuningConfigAllreduce: Parameters:
      • ngpus: Number of GPUs for running allreduce.
      • num_bytes: Total number of input bytes supported by the config.
      • sm_version: SM version (as string).
      • num_blocks: Number of thread blocks for running allreduce.
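
To make the role of the four parameters concrete, here is a minimal sketch in Python (not Mojo) that models the entries of allreduce_table and selects a num_blocks value. The values are copied from the table above; everything else is an assumption for illustration: the helper name pick_num_blocks is hypothetical, rows with ngpus = -1 and num_bytes = -1 are assumed to be per-architecture defaults, and num_bytes is assumed to act as an upper bound on input size. The actual lookup used by the module is not shown on this page and may differ.

```python
from dataclasses import dataclass
from typing import Optional

MiB = 1 << 20


@dataclass(frozen=True)
class TuningConfigAllreduce:
    ngpus: int        # number of GPUs; -1 is ASSUMED to mark a default row
    num_bytes: int    # input bytes supported; -1 is ASSUMED to mean "any"
    sm_version: str   # architecture string, e.g. "sm_90a" or "CDNA4"
    num_blocks: int   # number of thread blocks for running allreduce


# Same 17 entries as allreduce_table, with byte counts written in MiB.
ALLREDUCE_TABLE = [
    TuningConfigAllreduce(-1, -1, "sm_90a", 216),
    TuningConfigAllreduce(4, 128 * MiB, "sm_90a", 232),
    TuningConfigAllreduce(-1, -1, "sm_100a", 512),
    *[
        TuningConfigAllreduce(n, size * MiB, "sm_100a", 512)
        for n in (2, 4)
        for size in (8, 16, 32, 64, 128)
    ],
    TuningConfigAllreduce(-1, -1, "CDNA3", 32),
    TuningConfigAllreduce(-1, -1, "CDNA4", 64),
    TuningConfigAllreduce(8, 1 * MiB, "CDNA4", 64),
    TuningConfigAllreduce(8, 2048 * MiB, "CDNA4", 44),
]


def pick_num_blocks(sm_version: str, ngpus: int, num_bytes: int) -> Optional[int]:
    """Hypothetical lookup: smallest tuned row that covers the request,
    else the architecture's default row, else None."""
    rows = [r for r in ALLREDUCE_TABLE if r.sm_version == sm_version]
    # Tuned rows for this GPU count whose byte limit covers the input.
    tuned = [r for r in rows if r.ngpus == ngpus and r.num_bytes >= num_bytes]
    if tuned:
        return min(tuned, key=lambda r: r.num_bytes).num_blocks
    # Fall back to the (-1, -1) row for the architecture, if one exists.
    defaults = [r for r in rows if r.ngpus == -1 and r.num_bytes == -1]
    return defaults[0].num_blocks if defaults else None
```

Under these assumptions, a 4-GPU, 64 MiB allreduce on sm_90a resolves to the tuned row (232 blocks), while an 8-GPU request on the same architecture falls back to the default row (216 blocks).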

Functions
