Tag: responsible-ai

Blog Post·2024-06-19·5 min read

Privacy Leakage in LLMs: PII, Memorization, and Code Generation Risks

LLMs memorize parts of their training data, and under the right prompts they reproduce it. Here's how memorization works, how to measure it, and which privacy risks are specific to code generation models.

responsible-ai privacy memorization code-generation

Blog Post·2024-06-19·5 min read

Red-Teaming vs Automated Evals: Tradeoffs and When to Use Each

Human red-teaming finds attacks automated evals miss. Automated evals achieve scale humans can't. Here's how to combine them, and what each can and can't tell you.

responsible-ai red-teaming evaluation safety llm

Blog Post·2024-06-19·5 min read

Runtime Guardrails: Architecture Patterns for Production AI Safety

Training-time alignment is not enough. Production AI systems need runtime layers that detect, intercept, and respond to harmful inputs and outputs. Here's how to build them.

responsible-ai safety guardrails production llm

Blog Post·2024-06-19·6 min read

Designing Safety Benchmarks for LLMs: What Makes an Eval Good

Most safety benchmarks can be gamed, suffer from distribution shift, or measure the wrong thing. Here's what separates a rigorous safety evaluation from a checkbox.

responsible-ai safety evaluation benchmarking