S. Roy

Blog Post·2024-06-19·4 min read

V-JEPA: Predicting the Future in Representation Space

V-JEPA extends JEPA to video: it predicts the representations of future or masked frames from the visible context frames. There is no pixel reconstruction and no contrastive loss, only prediction across time in an abstract representation space.
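
To make "prediction in representation space" concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not the actual V-JEPA implementation: the encoders are single linear maps, context frames are mean-pooled, the loss is an L1 distance, and the target encoder is an exponential moving average of the context encoder. None of those choices are specified in this post, and all names (`linear`, `ema_update`, `jepa_loss`) are hypothetical.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def init_matrix(rows, cols, rng):
    """Small random weight matrix, used here as a stand-in for a real network."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def ema_update(target, online, momentum=0.99):
    """Move target weights toward online weights (assumed EMA target encoder)."""
    return [
        [momentum * t + (1.0 - momentum) * o for t, o in zip(t_row, o_row)]
        for t_row, o_row in zip(target, online)
    ]


def jepa_loss(frames, context_idx, target_idx, enc, tgt_enc, pred):
    """Mean L1 distance between predicted and target frame representations.

    Nothing is compared in pixel space: both sides of the loss are
    encoder outputs.
    """
    # Encode the visible (context) frames and mean-pool them into one vector.
    ctx = [linear(enc, frames[i]) for i in context_idx]
    dim = len(ctx[0])
    pooled = [sum(c[d] for c in ctx) / len(ctx) for d in range(dim)]

    # The predictor maps the pooled context to a guess at the hidden frames.
    predicted = linear(pred, pooled)

    total = 0.0
    for i in target_idx:
        # In real training, no gradient would flow through the target encoder.
        target = linear(tgt_enc, frames[i])
        total += sum(abs(p - t) for p, t in zip(predicted, target)) / dim
    return total / len(target_idx)


rng = random.Random(0)
frames = [[rng.random() for _ in range(16)] for _ in range(8)]  # 8 toy "frames"
enc = init_matrix(4, 16, rng)      # context encoder
tgt_enc = [row[:] for row in enc]  # target encoder starts as a copy
pred = init_matrix(4, 4, rng)      # predictor

# Frames 0-4 are visible context; frames 5-7 are the masked/future targets.
loss = jepa_loss(frames, [0, 1, 2, 3, 4], [5, 6, 7], enc, tgt_enc, pred)
tgt_enc = ema_update(tgt_enc, enc)  # would follow each optimizer step
```

A real model would replace the linear maps with video transformers, predict each masked region separately rather than from one pooled vector, and train `enc` and `pred` by gradient descent. The structure of the loss is the part worth noticing.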

ssl jepa v-jepa video world-models

