Setup & Installation
What This Skill Does
This skill analyzes LLM experiment results from Datadog. It handles a single experiment or a comparison of two, in either exploratory or Q&A mode. Given one or two experiment IDs, it pulls metrics, segments failures, samples representative events, and produces a structured report with root-cause hypotheses and actionable recommendations.

Without the skill, the same analysis means querying experiment summaries by hand, cross-referencing metrics by segment, and sampling failure events one by one. The skill runs that full pipeline automatically and delivers a report with specific numbers and linked examples.
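The pipeline stages named above (segment failures, sample representative events, assemble a report) can be illustrated with a minimal Python sketch. It is not the skill's implementation and makes no Datadog calls: the `EVENTS` records, their field names (`id`, `segment`, `passed`), and the three helper functions are all hypothetical, chosen only to show the shape of the analysis on in-memory data.

```python
import random
from collections import defaultdict

# Hypothetical event records. The real skill pulls experiment events from
# Datadog; these field names are assumptions made for illustration only.
EVENTS = [
    {"id": "e1", "segment": "short_prompt", "passed": True},
    {"id": "e2", "segment": "short_prompt", "passed": False},
    {"id": "e3", "segment": "long_prompt", "passed": False},
    {"id": "e4", "segment": "long_prompt", "passed": False},
    {"id": "e5", "segment": "long_prompt", "passed": True},
    {"id": "e6", "segment": "short_prompt", "passed": True},
]


def segment_failures(events):
    """Group events by segment and compute a failure rate for each segment."""
    totals = defaultdict(int)
    failures = defaultdict(list)
    for event in events:
        totals[event["segment"]] += 1
        if not event["passed"]:
            failures[event["segment"]].append(event)
    return {
        seg: {
            "total": totals[seg],
            "failed": len(failures[seg]),
            "failure_rate": len(failures[seg]) / totals[seg],
            "failures": failures[seg],
        }
        for seg in totals
    }


def sample_failures(stats, k=2, seed=0):
    """Pick up to k failing event IDs per segment, reproducibly."""
    rng = random.Random(seed)
    return {
        seg: [ev["id"] for ev in rng.sample(s["failures"], min(k, len(s["failures"])))]
        for seg, s in stats.items()
    }


def build_report(stats, samples):
    """Render one line per segment, worst failure rate first."""
    ranked = sorted(stats, key=lambda seg: stats[seg]["failure_rate"], reverse=True)
    lines = []
    for seg in ranked:
        s = stats[seg]
        examples = ", ".join(samples[seg]) or "none"
        lines.append(
            f"{seg}: {s['failed']}/{s['total']} failed "
            f"({s['failure_rate']:.0%}); examples: {examples}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    stats = segment_failures(EVENTS)
    print(build_report(stats, sample_failures(stats)))
```

Ranking segments by failure rate and attaching a few sampled event IDs to each mirrors the "specific numbers and linked examples" the report is meant to contain. Root-cause hypotheses and recommendations are left out because they depend on the content of the actual events.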
When to use it
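- Analyzing the results of a single Datadog LLM experiment, either exploring them broadly or answering a specific question about them
- Comparing two experiments to see where their metrics and failures differ
- Debugging why an experiment fails on particular segments, using sampled failure events as evidence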
