Blog Post··8 min read
How Decoder-Only Transformers Evolved Since GPT-2
GPT-2 established the decoder-only transformer as the dominant paradigm. What followed was six years of systematic improvements — in scale, efficiency, alignment, and reasoning. Here's the arc.