feat(docker): PP-04 可观测性 MVP — Alertmanager 告警出口 + Grafana provisioning

PP-04 核实属实:11 条告警规则在 prometheus 加载但无 alertmanager(告警
无通知出口),grafana provisioning 目录空,exporter 服务也未部署
("配置齐全运行为零")。

MVP 打通告警链路 + 让 grafana 可用(不依赖 exporter,基于 app metrics):
- docker-compose.production.yml 加 alertmanager 服务 + alertmanager_data 卷
- prometheus.yml 加 alerting 指向 alertmanager:9093
- alertmanager/config.yml 路由(SEV-1 critical 即时通知 + 分组)
- grafana/provisioning/datasources 自动连 prometheus
- grafana/provisioning/dashboards provider 就绪

待办(上线前):① alertmanager 占位 webhook 替换为真实渠道(钉钉/企微/邮件)
② 补 grafana dashboard JSON ③ 部署 postgres/redis/nginx exporter 让 prometheus 抓得到
This commit is contained in:
iven
2026-06-26 09:25:43 +08:00
parent 3351c68d10
commit 6457c53d9c
5 changed files with 86 additions and 0 deletions

View File

@@ -134,6 +134,23 @@ services:
networks:
- hms-internal
# ── Alertmanager 告警通知出口 ──
# PP-04: 之前 11 条告警规则在 prometheus 加载但无 alertmanager告警触发无人知晓
alertmanager:
image: prom/alertmanager:v0.27.0
container_name: hms-alertmanager
restart: unless-stopped
volumes:
- ./alertmanager/config.yml:/etc/alertmanager/config.yml:ro
- alertmanager_data:/alertmanager
command:
- "--config.file=/etc/alertmanager/config.yml"
- "--storage.path=/alertmanager"
expose:
- "9093"
networks:
- hms-internal
# ── Grafana 可视化 ──
grafana:
image: grafana/grafana:11.4.0
@@ -167,6 +184,8 @@ volumes:
driver: local
grafana_data:
driver: local
alertmanager_data:
driver: local
networks:
hms-internal: