Tag: sft

Blog Post·2024-06-19·8 min read

The LLM Alignment Pipeline: SFT, Reward Models, and RL End to End

Training a helpful, harmless, honest LLM requires three sequential stages, each building on the previous one. Here's how supervised fine-tuning (SFT), reward modeling, and reinforcement learning (RL) fit together as a system, and where each stage can fail.

alignment rlhf sft reward-model llm-training

Blog Post·2024-06-19·5 min read

Why Language Models Need Reinforcement Learning

Supervised fine-tuning teaches a model to imitate. Reinforcement learning teaches it to optimize. The difference turns out to matter enormously.

rl llm-training sft

Blog Post·2024-06-19·7 min read

Synthetic Data for Alignment: Curation, Quality Filtering, and Self-Critique

Human annotation doesn't scale to the data volumes modern alignment requires. Synthetic data — generated by LLMs, filtered, and refined — has become the dominant approach. Here's how it's done and where it breaks down.

alignment synthetic-data sft data-curation llm-training