Research

research-project · placeholder

Reliable LLM Evaluation Methods

Placeholder for author-reviewed research on rubrics, judges, benchmark design, and reproducible evaluation methodology.

evaluation, rubrics, benchmarks

research-project · placeholder

Placeholder for work on model specifications, constitutions, behavioral audits, and alignment-relevant post-training analysis.

model-behavior, model-specs, alignment

initiative · placeholder

Selected project names from the old portfolio, migrated here for review: Labor Room Register, Outbreak Responder, BeeHyv.com, Munchbot, and MayAIHelp.

archive, software