OpenAI Evals API
The Evals API lets you programmatically configure and run evaluations that test model outputs against your expectations. Evaluations check whether model responses meet the style and content criteria you specify, and they are essential for building reliable LLM applications, especially when upgrading to or trying new models.
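To make the idea concrete, here is a minimal local sketch of what an evaluation does conceptually. It does not call the Evals API or any OpenAI endpoint: `EvalCase`, `run_eval`, `current_model`, and `candidate_model` are hypothetical names invented for illustration, and the two "models" are plain stub functions standing in for real model calls. The criterion used (case-insensitive exact match) is just one assumed example of a content check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One test item: an input prompt and the output we expect."""
    prompt: str
    expected: str


def run_eval(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases where the model output matches the
    expected label (case-insensitive, ignoring surrounding whitespace)."""
    if not cases:
        raise ValueError("cases must not be empty")
    passed = sum(
        1
        for case in cases
        if model(case.prompt).strip().lower() == case.expected.strip().lower()
    )
    return passed / len(cases)


# Hypothetical stand-ins for two model versions (no network calls).
def current_model(prompt: str) -> str:
    return "hardware" if "monitor" in prompt.lower() else "software"


def candidate_model(prompt: str) -> str:
    text = prompt.lower()
    return "Hardware" if ("monitor" in text or "keyboard" in text) else "Software"


CASES = [
    EvalCase("My monitor won't turn on", "hardware"),
    EvalCase("The app crashes on launch", "software"),
    EvalCase("My keyboard is missing a key", "hardware"),
]

if __name__ == "__main__":
    # Compare pass rates before deciding whether to switch models.
    print("current:  ", run_eval(current_model, CASES))
    print("candidate:", run_eval(candidate_model, CASES))
```

Running the same fixed set of cases against both versions is what makes the comparison meaningful: a change in pass rate can then be attributed to the model rather than to the test data.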