Information Theory for LLMs: Mutual Information, Entropy, and What Models Learn
Information theory gives precise answers to questions like: how much does the context tell you about the next token? What information is preserved in a representation? Why do compression and prediction point to the same objective?
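To make the first question concrete, here is a minimal sketch that measures how much a one-word context tells you about the next token. The joint distribution is a made-up toy example, and the names `joint`, `entropy`, and `marginal` are illustrative rather than taken from any library. It uses the standard identities H(T|C) = H(C,T) - H(C) and I(C;T) = H(T) - H(T|C), with everything measured in bits.

```python
import math

# Hypothetical toy joint distribution p(context, next_token).
# The probabilities are invented for illustration and sum to 1.
joint = {
    ("the", "cat"): 0.25,
    ("the", "dog"): 0.25,
    ("a", "cat"): 0.40,
    ("a", "dog"): 0.10,
}


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint_dist, axis):
    """Sum the joint over the other variable (axis 0 = context, 1 = token)."""
    out = {}
    for key, p in joint_dist.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


p_context = marginal(joint, 0)
p_token = marginal(joint, 1)

h_context = entropy(p_context)   # uncertainty about the context
h_token = entropy(p_token)       # uncertainty about the next token, no context
h_joint = entropy(joint)         # uncertainty about the pair

# Chain rule: H(T | C) = H(C, T) - H(C)
h_token_given_context = h_joint - h_context

# Mutual information: how many bits the context removes from the next token
mutual_information = h_token - h_token_given_context

print(f"H(token)           = {h_token:.3f} bits")
print(f"H(token | context) = {h_token_given_context:.3f} bits")
print(f"I(context; token)  = {mutual_information:.3f} bits")
```

In this toy case the context removes only a small fraction of a bit, because "cat" is the more likely continuation under both contexts; a context that sharply changes the next-token distribution would give a larger value.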