Test Set Editor

Create and edit JSONL test sets for benchmarking

JSONL Editor

Each line should be a single valid JSON object with at least the "id" and "content" fields. Optional fields: "expected_answer", "category", "difficulty", "metadata".

Format Example:

{"id": "1", "content": "What is 2+2?", "expected_answer": "4", "category": "math"}
{"id": "2", "content": "Explain AI", "expected_answer": "...", "difficulty": "hard"}

Quick Stats

Total Prompts

With Expected Answers
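
The two counters in Quick Stats can be reproduced offline with a short script. The sketch below is an assumption about how such counts might be computed, not the editor's actual code: the name `quick_stats`, the returned keys, and the rule that an empty or null "expected_answer" does not count are all hypothetical.

```python
import json


def quick_stats(text):
    """Count prompts and prompts with an expected answer in a JSONL string.

    Hypothetical helper: lines that are blank, unparsable, or not JSON
    objects are skipped rather than counted.
    """
    total = 0
    with_expected = 0
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(obj, dict):
            continue
        total += 1
        # Assumption: a missing, null, or empty answer does not count.
        if obj.get("expected_answer") not in (None, ""):
            with_expected += 1
    return {"total_prompts": total, "with_expected_answers": with_expected}


if __name__ == "__main__":
    sample = (
        '{"id": "1", "content": "What is 2+2?", "expected_answer": "4"}\n'
        '{"id": "2", "content": "Explain AI", "difficulty": "hard"}\n'
    )
    print(quick_stats(sample))
```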