research-project · placeholder
Reliable LLM Evaluation Methods
Placeholder for author-reviewed research on rubrics, judges, benchmark design, and reproducible evaluation methodology.
evaluation, rubrics, benchmarks
research-project · placeholder
Placeholder for work on model specifications, constitutions, behavioral audits, and alignment-relevant post-training analysis.
model-behavior, model-specs, alignment
initiative · placeholder
Selected project names migrated from the old portfolio for review: Labor Room Register, Outbreak Responder, BeeHyv.com, Munchbot, and MayAIHelp.
archive, software