Overview
Galileo is an AI observability and eval engineering platform that turns offline evaluations into production guardrails. Users capture ground truth by building datasets from synthetic, development, and live production data and enriching them with annotations from subject matter experts. The platform helps create accurate evaluations by auto-tuning metrics from live feedback, which optimizes them for a specific environment. Optimized evaluations can then be distilled into Luna models, which makes it possible to monitor 100% of traffic at reduced cost. For debugging, Galileo analyzes agent behavior, identifies failure modes, and prescribes fixes, which speeds up AI deployments and improves the reliability of AI systems. It offers out-of-the-box evals for RAG, agents, safety, and security, and supports custom evaluators.
Common tasks