Overview
LitmusChaos is a CNCF incubating project that provides an end-to-end framework for cloud-native chaos engineering. Its architecture is Kubernetes-native: chaos experiments are defined as custom resources, backed by Custom Resource Definitions (CRDs), so they can be written, reviewed, and applied as declarative code. As of 2026, LitmusChaos is widely used by platform teams moving from reactive monitoring to proactive resilience testing. It lets SREs orchestrate failure scenarios, ranging from pod kills and network latency to cloud-provider API failures, and run them directly from CI/CD pipelines.

The platform includes ChaosCenter, a unified control plane for multi-tenant experiment management, and ChaosHub, a public repository of pre-built experiments. Because experiments are plain manifests, the architecture supports GitOps workflows, allowing teams to version-control their resilience tests alongside application code.

In the 2026 market landscape, LitmusChaos is a leading open-source alternative to proprietary solutions such as Gremlin. It is favored for its integration with the Prometheus/Grafana stack and for its ability to run entirely within air-gapped or highly regulated environments.
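To make the "experiments as declarative code" idea concrete, the following is a minimal sketch in Python that builds a ChaosEngine manifest for the pod-delete experiment and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the `litmuschaos.io/v1alpha1` ChaosEngine schema as commonly documented; the helper `build_pod_delete_engine` and the resource names (`nginx-chaos`, `app=nginx`, `pod-delete-sa`) are hypothetical and chosen only for illustration.

```python
import json


def build_pod_delete_engine(name, namespace, app_label, service_account, duration_s=30):
    """Return a ChaosEngine manifest (as a plain dict) targeting a Deployment.

    Hypothetical helper for illustration; it only assembles data and does not
    talk to a cluster.
    """
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # "active" starts the experiment; "stop" aborts it.
            "engineState": "active",
            # Which application the chaos is aimed at.
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            # Service account with permissions to run the experiment (assumed to exist).
            "chaosServiceAccount": service_account,
            "experiments": [
                {
                    "name": "pod-delete",
                    "spec": {
                        "components": {
                            "env": [
                                # Env values are strings in Kubernetes manifests.
                                {"name": "TOTAL_CHAOS_DURATION", "value": str(duration_s)},
                            ]
                        }
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    engine = build_pod_delete_engine("nginx-chaos", "default", "app=nginx", "pod-delete-sa")
    print(json.dumps(engine, indent=2))
```

Saving the printed output to a file and committing it next to the application manifests is one way to fit such an experiment into a GitOps workflow; the file could then be applied with `kubectl apply -f`, assuming the Litmus CRDs and the pod-delete experiment are already installed in the cluster.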
