Literal AI
Literal AI, from the Chainlit team, is a collaborative observability, evaluation, and analytics platform for building production-grade LLM applications. Its GraphQL API (POST /api/graphql) is consumed through Python and TypeScript SDKs and captures threads, steps, generations, datasets, experiments, prompts, and scores. Traces can also be ingested through an additional OpenTelemetry (OTLP) path.
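For readers who want to see what a raw call to the GraphQL endpoint might look like without the SDKs, here is a minimal Python sketch that builds (but does not send) a POST request to /api/graphql. Only the endpoint path comes from the description above. The base URL, the x-api-key header name, and the query's field names are assumptions made for illustration, so check them against your own project settings and the published schema before relying on them.

```python
import json
import urllib.request

# Assumed values for illustration only; neither is stated in this listing.
BASE_URL = "https://cloud.getliteral.ai"  # assumption: hosted instance URL
API_KEY_HEADER = "x-api-key"              # assumption: auth header name


def build_graphql_request(query, variables=None, api_key="", base_url=BASE_URL):
    """Build (but do not send) a POST request to the /api/graphql endpoint."""
    # A GraphQL-over-HTTP body is a JSON object with "query" and "variables".
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


# Hypothetical query: the field names are placeholders, not the documented schema.
EXAMPLE_QUERY = (
    "query Threads($first: Int) { threads(first: $first) { edges { node { id } } } }"
)

request = build_graphql_request(EXAMPLE_QUERY, {"first": 5}, api_key="YOUR_API_KEY")
```

Passing the resulting object to urllib.request.urlopen would send it. In practice the Python and TypeScript SDKs wrap this step, so a hand-built request like this is mainly useful for debugging or for languages without an SDK.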
APIs
Literal AI Threads & Steps API
Create, read, update, upsert, and delete conversation threads and the nested steps (runs, tools, retrievals, LLM calls) that trace an LLM application's execution, queried and mu...
Literal AI Generations API
Log and paginate chat and completion generations - prompts, model, settings, token usage, and outputs - with filtering for analytics and evaluation.
Literal AI Datasets API
Build and manage datasets and dataset items - curated from production steps or created manually - that serve as ground truth for evaluation and experiments.
Literal AI Experiments API
Create dataset experiments and record per-item experiment runs with scores to benchmark prompt, model, and pipeline changes over time.
Literal AI Prompts API
Version, store, and retrieve prompt templates with their model settings, enabling collaborative prompt engineering and A/B testing against deployed apps.
Literal AI Scores API
Attach human, AI, and code-based scores to steps and generations - numeric, categorical, or boolean - for feedback collection and offline/online evaluation.