Confident AI

DeepEval

DeepEval is an open-source Python framework for evaluating LLM applications as unit tests. It ships with research-backed metrics including GEval, AnswerRelevancyMetric, FaithfulnessMetric, TaskCompletionMetric, and ConversationalGEval, and supports end-to-end and component-level testing, multi-turn conversations, and LLM tracing for agents.

Documentation

Getting Started: https://deepeval.com/docs/getting-started

Documentation: https://deepeval.com/docs/

Other Resources

Source Code: https://github.com/confident-ai/deepeval

SDKs: https://pypi.org/project/deepeval/

APIsJSON: https://raw.githubusercontent.com/api-evangelist/confident-ai/refs/heads/main/apis.yml

API entry from apis.yml

aid: confident-ai:deepeval
name: DeepEval
tags:
  - Open Source
  - LLM Evaluation
  - Python
  - Testing Framework
humanURL: https://deepeval.com/
properties:
  - url: https://deepeval.com/docs/getting-started
    type: GettingStarted
  - url: https://deepeval.com/docs/
    type: Documentation
  - url: https://github.com/confident-ai/deepeval
    type: SourceCode
  - url: https://pypi.org/project/deepeval/
    type: SDKs
description: DeepEval is an open-source Python framework for evaluating LLM applications as unit tests. It ships with research-backed metrics including GEval, AnswerRelevancyMetric, FaithfulnessMetric, TaskCompletionMetric, and ConversationalGEval, and supports end-to-end and component-level testing, multi-turn conversations, and LLM tracing for agents.
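The entry above pairs each resource URL with a type, so a consumer will usually want to look links up by type. The sketch below assumes the entry has already been parsed into a Python dict (for example by a YAML loader); the helper name `property_urls` is hypothetical and not part of any APIs.json tooling.

```python
# The catalog entry as a YAML loader would return it (assumed already parsed).
entry = {
    "aid": "confident-ai:deepeval",
    "name": "DeepEval",
    "tags": ["Open Source", "LLM Evaluation", "Python", "Testing Framework"],
    "humanURL": "https://deepeval.com/",
    "properties": [
        {"url": "https://deepeval.com/docs/getting-started", "type": "GettingStarted"},
        {"url": "https://deepeval.com/docs/", "type": "Documentation"},
        {"url": "https://github.com/confident-ai/deepeval", "type": "SourceCode"},
        {"url": "https://pypi.org/project/deepeval/", "type": "SDKs"},
    ],
}


def property_urls(entry):
    """Map each property type to its URL (hypothetical helper)."""
    return {prop["type"]: prop["url"] for prop in entry.get("properties", [])}


urls = property_urls(entry)
print(urls["SourceCode"])  # https://github.com/confident-ai/deepeval
```

If an entry ever listed the same type twice, this mapping would keep only the last URL; collecting URLs into a list per type would be the safer choice in that case.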
