Open-source systems monitoring and alerting toolkit that collects and stores time-series metrics via a pull-based model, purpose-built for reliability and cloud-native, microservices-based architectures.
Key Features
- Multi-dimensional data model with metrics identified by name and key/value label pairs
- PromQL, a flexible query language for aggregation, filtering, and analysis
- Pull-based metric collection over HTTP with configurable scrape intervals
- Alerting through Alertmanager, a separate companion component that handles alert routing, deduplication, grouping, and notification (Slack, PagerDuty, email)
- Automatic service discovery for Kubernetes, Consul, EC2, and more
- Autonomous single-server nodes, each with a local time-series database (no dependency on distributed storage)
- Push support via Pushgateway for short-lived jobs
- Hundreds of community exporters (Node, Blackbox, MySQL, Redis, etc.)
- Native Kubernetes and container-level monitoring
- Pairs with Grafana as the de facto visualization layer
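The data model and pull-based collection listed above can be illustrated with a minimal sketch. It renders samples as a metric name plus key/value label pairs in the text exposition format and serves them over HTTP for a scraper to pull. It uses only the Python standard library rather than an official client library. The names `render_sample`, `render_exposition`, `MetricsHandler`, `serve`, the sample values, and port 8000 are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Iterable, Tuple, Union

Number = Union[int, float]
Sample = Tuple[str, Dict[str, str], Number]

# Hypothetical in-memory samples: (metric name, labels, value).
SAMPLES = [
    ("http_requests_total", {"method": "GET", "status": "200"}, 1027),
    ("http_requests_total", {"method": "POST", "status": "500"}, 3),
]


def escape_label_value(value: str) -> str:
    # Label values escape backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: Dict[str, str], value: Number) -> str:
    # One sample per line: name{key="value",...} value
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"


def render_exposition(samples: Iterable[Sample]) -> str:
    # Newline-terminated lines; TYPE and HELP comments are omitted in this sketch.
    return "".join(render_sample(n, l, v) + "\n" for n, l, v in samples)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # The scraper pulls from /metrics; every other path returns 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_exposition(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    # Blocks until interrupted; host and port are illustrative defaults.
    HTTPServer((host, port), MetricsHandler).serve_forever()
```

Calling `serve()` would expose the samples at `/metrics` on the assumed address, where a scrape job pointed at that target could collect them at its configured interval. Real services would normally use an official client library instead of hand-rendering the format.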
Background
- Originally built at SoundCloud in 2012; second CNCF graduated project (after Kubernetes).
- De facto standard for Kubernetes and cloud-native metrics monitoring.
- Standalone, reliability-first design: each server runs independently of network storage and other remote services, so it stays usable while other parts of your infrastructure are down.
- Commonly deployed as part of the Prometheus + Grafana + Alertmanager stack.