Tag: representation-learning

Blog Post·2025-06-20·11 min read

Representation Geometry: How Neural Networks Encode Meaning

The linear representation hypothesis, superposition, polysemanticity, and why transformer activations are more structured than they look.

mechanistic-interpretability representation-learning superposition linear-representation-hypothesis transformers

Blog Post·2024-06-19·6 min read

Information Theory for LLMs: Mutual Information, Entropy, and What Models Learn

Information theory gives precise answers to questions like: how much does the context tell you about the next token? What information is preserved in a representation? Why do compression and prediction point to the same objective?

math information-theory entropy representation-learning

Blog Post·2024-06-19·5 min read

JEPA: Predicting in Representation Space

MAE predicts pixels. Contrastive methods match views. JEPA predicts representations of target regions from context regions — in an abstract space where irrelevant details have already been discarded.

ssl jepa representation-learning world-models

Blog Post·2024-06-19·5 min read

Why Self-Supervised Learning? The Label Bottleneck

Supervised learning requires labels. Labels require humans. At scale, that's the bottleneck. Self-supervised learning sidesteps it by constructing supervision from the data itself.

ssl self-supervised representation-learning