Context
Built open-evals as an MIT-licensed platform for eval rows, failure attribution, calibration, prompt optimization, and release decisions. It imports traces as evidence for evals without turning into a generic observability product.
Scope
- Deterministic evals that work without AI credentials
- Trace import adapters for common eval and observability formats
- Prompt release, optimization, A/B/n assignment, and privacy-aware artifact handling
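
To make the first and last scope items concrete, here is a minimal sketch of a deterministic eval check and a deterministic A/B/n assignment. It is an illustration, not the open-evals API: `EvalRow`, `score_row`, `assign_variant`, and the `salt` parameter are hypothetical names, and the row shape is an assumption. Neither function calls a model, so both run without AI credentials.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRow:
    # Hypothetical row shape; the real open-evals schema may differ.
    row_id: str
    output: str
    expected_pattern: str


def score_row(row: EvalRow) -> bool:
    """Pass if the stripped output fully matches the expected regex."""
    return re.fullmatch(row.expected_pattern, row.output.strip()) is not None


def assign_variant(unit_id: str, variants: list[str], salt: str = "exp-1") -> str:
    """Map a unit to one of n variants using a stable SHA-256 hash."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer bucket; same input, same variant.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


rows = [
    EvalRow("r1", "42", r"\d+"),
    EvalRow("r2", "forty-two", r"\d+"),
]
results = {row.row_id: score_row(row) for row in rows}
```

Because both functions are pure, rerunning them on the same rows and unit IDs yields the same scores and the same variant assignments, which is what makes results comparable across runs.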