Deepchecks
Deepchecks is a testing, evaluation, and monitoring platform for ML models and LLM applications. Its cloud LLM Evaluation product exposes a REST API for logging LLM interactions, managing applications and versions, retrieving annotations, and configuring evaluation properties. Its open-source Python packages provide continuous validation of tabular, computer-vision, and LLM data and models.
APIs
Deepchecks LLM Interactions / Logging API
Logs raw LLM interactions (input, output, context, history, and metadata) to a specific application version for evaluation, retrieves enriched interactions by filter, downloads ...
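As a rough illustration of the interaction fields named above (input, output, context, history, and metadata), the sketch below assembles a JSON body for one application version. The class name `Interaction`, the helper `build_log_payload`, and the keys `app_name`, `version_name`, and `interactions` are assumptions made for this example; they are not taken from the Deepchecks API reference, and no request is sent.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Interaction:
    """One LLM interaction. Field names are illustrative, not the official schema."""
    input: str
    output: str
    context: Optional[str] = None
    history: List[str] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_log_payload(app_name: str, version_name: str,
                      interactions: List[Interaction]) -> Dict[str, Any]:
    """Group interactions under a single application version (hypothetical body shape)."""
    if not app_name or not version_name:
        raise ValueError("app_name and version_name are required")
    return {
        "app_name": app_name,
        "version_name": version_name,
        "interactions": [asdict(item) for item in interactions],
    }


payload = build_log_payload(
    "support-bot",
    "v1",
    [
        Interaction(
            input="What is the refund window?",
            output="Refunds are accepted within 30 days.",
            context="Refund policy: 30 days from delivery.",
            metadata={"user_tier": "free"},
        )
    ],
)

# Serialized body that a client could send to whichever logging endpoint
# the official API reference specifies.
body = json.dumps(payload)
```

Keeping the payload builder separate from any HTTP call makes it possible to validate the body locally before wiring it to the real endpoint and authentication scheme.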
Deepchecks Applications / Versions API
Creates and lists evaluation applications and their versions, the organizational units that scope interactions, properties, and evaluation runs in the Deepchecks LLM Evaluation ...
Deepchecks Annotations API
Reads automatic and manual annotations (good / bad / unknown labels and reasons) attached to logged interactions, returned alongside interactions when downloading enriched evalu...
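Once enriched interactions have been downloaded, the good / bad / unknown labels can be tallied on the client side. The sketch below assumes each record carries an `annotation` object with `label` and `reason` keys; that record shape, and the helper names `summarize_annotations` and `bad_reasons`, are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, List

VALID_LABELS = ("good", "bad", "unknown")


def _label_of(record: dict) -> str:
    """Return the record's label, treating missing or unexpected values as 'unknown'."""
    annotation = record.get("annotation") or {}
    label = annotation.get("label")
    return label if label in VALID_LABELS else "unknown"


def summarize_annotations(records: Iterable[dict]) -> Dict[str, int]:
    """Count records per label, always reporting all three labels."""
    counts = {label: 0 for label in VALID_LABELS}
    for record in records:
        counts[_label_of(record)] += 1
    return counts


def bad_reasons(records: Iterable[dict]) -> List[str]:
    """Collect the reasons attached to records labeled 'bad'."""
    reasons = []
    for record in records:
        if _label_of(record) == "bad":
            reason = (record.get("annotation") or {}).get("reason")
            if reason:
                reasons.append(reason)
    return reasons


# Hypothetical sample data standing in for a downloaded export.
records = [
    {"id": "1", "annotation": {"label": "good", "reason": None}},
    {"id": "2", "annotation": {"label": "bad", "reason": "Answer not grounded in context"}},
    {"id": "3", "annotation": None},
    {"id": "4", "annotation": {"label": "bad", "reason": "Contains PII"}},
]

summary = summarize_annotations(records)
reasons = bad_reasons(records)
```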
Deepchecks Properties API
Configures LLM property definitions and display names per application, governing the built-in and custom properties (relevance, grounded-in-context, toxicity, PII, and more) tha...
Deepchecks Open-Source Testing
The AGPL-3.0 licensed open-source Python package for continuous validation of tabular, computer-vision, and LLM/NLP data and models. Distributed via PyPI (pip install deepchecks...