Documentation
Complete guide to using Conforma AI - LLM Benchmark & Analytics Platform
Quick Start
Get started with Conforma AI in 5 minutes
Configure LLM Providers
Navigate to Providers and add your API keys for OpenRouter, Anthropic, OpenAI, or Google.
Create Benchmark Tasks
Go to Tasks and create evaluation tasks from input/output pairs, or import them from a CSV file.
Run Benchmarks
Navigate to Benchmarks → New, select tasks and models, then execute.
Analyze Results
Open Results to view detailed charts, metrics, and model comparisons.
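The CSV import mentioned in the task-creation step can be pictured with a small file of input/output pairs. The sketch below is illustrative only: the column names `input` and `expected_output` and both helper functions are assumptions, since this guide does not specify the file layout Conforma AI expects.

```python
import csv
import io

# Hypothetical column names; the real import format may differ.
FIELDNAMES = ["input", "expected_output"]


def build_tasks_csv(pairs):
    """Serialize (input, expected_output) pairs as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for prompt, expected in pairs:
        writer.writerow({"input": prompt, "expected_output": expected})
    return buffer.getvalue()


def parse_tasks_csv(text):
    """Read CSV text back into a list of dicts, one per task."""
    return list(csv.DictReader(io.StringIO(text)))


pairs = [
    ("What is 2 + 2?", "4"),
    ("Translate 'hello' to French.", "bonjour"),
]
csv_text = build_tasks_csv(pairs)
tasks = parse_tasks_csv(csv_text)
```

Keeping one task per row, with the prompt and its expected output in separate columns, mirrors the input/output pairs described above and makes the file easy to review before importing.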
Key Concepts
Tasks
Evaluation scenarios with input prompts and expected outputs. Tasks can be reused across multiple benchmarks.
Benchmarks
Collections of tasks executed against selected LLMs. Each model's output is compared with the task's expected output using similarity metrics.
Providers
LLM API integrations (OpenRouter, Anthropic, OpenAI, Google). Configure once, use across all benchmarks.
Results
Detailed metrics including similarity scores, response times, token usage, and cost analysis.
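To make the similarity-score and cost metrics concrete, here is a minimal sketch of how such numbers could be computed. It is not Conforma AI's implementation: the guide does not say which similarity metric the platform uses, so this example swaps in the character-level ratio from Python's standard `difflib`, and the function names and per-million-token prices are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(expected, actual):
    """Return a score in [0, 1]; 1.0 means the normalized texts match exactly.

    Assumption: a character-level ratio stands in for whatever metric
    the platform really applies.
    """
    a = expected.strip().lower()
    b = actual.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_price_per_million, completion_price_per_million):
    """Estimate the cost of one request from token usage.

    Prices are per one million tokens; the values passed in below are
    placeholders, not real provider rates.
    """
    total = (prompt_tokens * prompt_price_per_million
             + completion_tokens * completion_price_per_million)
    return total / 1_000_000


score = similarity("Paris", "paris ")
cost = estimate_cost(1000, 500, 3.0, 15.0)
```

A ratio of this kind rewards near-verbatim answers, so it suits tasks with short, fixed expected outputs better than open-ended ones.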