Streaming multiprocessor
A streaming multiprocessor (SM) is the fundamental processing unit of a GPU, designed to execute many threads in parallel. Each SM contains several cores, which perform the actual computations for the threads running on that SM, along with shared resources such as registers, shared memory, and control mechanisms that coordinate thread execution.
The number of SMs and the number of cores on a GPU depend on its architecture. For example, the NVIDIA H100 PCIe contains 114 SMs, with 128 32-bit floating-point (FP32) cores per SM.
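As a rough illustration of how these two figures combine, the short Python sketch below multiplies the SM count by the per-SM core count to get the GPU-wide FP32 core total. The constant and function names are hypothetical, chosen only for this example; the numbers are the H100 PCIe figures quoted above.

```python
# Figures quoted in the text for the NVIDIA H100 PCIe.
SM_COUNT = 114
FP32_CORES_PER_SM = 128


def total_fp32_cores(sm_count: int, cores_per_sm: int) -> int:
    """Return the GPU-wide FP32 core count (hypothetical helper)."""
    return sm_count * cores_per_sm


total = total_fp32_cores(SM_COUNT, FP32_CORES_PER_SM)
print(total)  # 14592
```

The same arithmetic applies to other GPUs, provided you substitute the SM count and per-SM core count for that architecture.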