TransformerBlock
class max.nn.TransformerBlock(attention, mlp, attention_norm, mlp_norm, residual_multiplier=1.0)
Bases: Module
Stack of Attention, FeedForward, and RMSNorm layers.
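The signature above suggests the standard pre-norm residual wiring: each sub-layer's input is normalized, transformed, scaled by `residual_multiplier`, and added back to the residual stream. This is a minimal self-contained sketch of that pattern using plain Python callables as hypothetical stand-ins for the MAX layers; it is an illustration of the wiring, not the actual MAX implementation.

```python
# Sketch of the pre-norm residual pattern a TransformerBlock applies.
# Assumed forward pass (not confirmed by this page):
#   h = x + residual_multiplier * attention(attention_norm(x))
#   y = h + residual_multiplier * mlp(mlp_norm(h))
class TransformerBlockSketch:
    def __init__(self, attention, mlp, attention_norm, mlp_norm,
                 residual_multiplier=1.0):
        self.attention = attention
        self.mlp = mlp
        self.attention_norm = attention_norm
        self.mlp_norm = mlp_norm
        self.residual_multiplier = residual_multiplier

    def __call__(self, x):
        # Attention sub-layer: normalize, transform, scale, add residual.
        h = x + self.residual_multiplier * self.attention(self.attention_norm(x))
        # Feed-forward sub-layer: same pre-norm residual pattern.
        return h + self.residual_multiplier * self.mlp(self.mlp_norm(h))


# Toy scalar "layers" just to exercise the wiring.
block = TransformerBlockSketch(
    attention=lambda x: 2.0 * x,
    mlp=lambda x: x + 1.0,
    attention_norm=lambda x: x,
    mlp_norm=lambda x: x,
    residual_multiplier=0.5,
)
print(block(1.0))  # h = 1 + 0.5*2 = 2.0; output = 2 + 0.5*(2+1) = 3.5
```

In the real class, `attention`, `mlp`, `attention_norm`, and `mlp_norm` would be MAX `Module` instances (e.g. attention, feed-forward, and RMSNorm layers), and `residual_multiplier` lets the residual branch be down-weighted, which some deep-model recipes use to stabilize training.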