Blog Post · 10 min read
Cheatsheet: LLM Forward Pass Equations
The full forward pass, written out as equations, for GPT-2, Qwen3-8B, DeepSeek-V3, and GPT-OSS. Every matrix, every norm, every residual — in the order the computation actually happens.