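feat(docker): PP-04 follow-up: Grafana HMS overview dashboard + postgres/redis exporters + alert channel docs

Continues the PP-04 MVP and closes the observability loop:
- grafana/provisioning/dashboards/json/hms-overview.json: HMS overview dashboard
  (service status, DB connection pool, EventBus backlog, memory/CPU, API 5xx error rate; based on app metrics)
- postgres-exporter + redis-exporter services: prometheus.yml already listed them as targets, but
  the services were never deployed (so alerts on pg_stat_activity, redis_memory, etc. could never fire); now added
- alertmanager now runs with --config.expand-env: lets channel tokens be injected from .env as ${VAR}
  (avoids repeating the PP-03 mistake of committing the Redis password to git in plaintext)
- alertmanager/README.md: configuration docs for the DingTalk / WeCom / email channels (fill in before go-live)

nginx-exporter skipped (alerts.yml has no nginx rules, and it would require changing nginx.conf to enable stub_status)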
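
For illustration only, the kind of rule that depends on these exporters might look like the sketch below. It is not taken from the repository's alerts.yml: the group name, alert names, and thresholds are hypothetical, and the metric names are the ones postgres_exporter and redis_exporter are known to expose by default.

```yaml
# Hypothetical alerting rules; names and thresholds are assumptions, not the project's alerts.yml.
groups:
  - name: exporter-backed-alerts        # illustrative group name
    rules:
      # pg_stat_activity_count is exported by postgres_exporter (one series per database/state).
      - alert: PostgresTooManyConnections
        expr: sum by (instance) (pg_stat_activity_count) > 80   # 80 is an assumed threshold
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL connection count is high on {{ $labels.instance }}"
      # redis_memory_max_bytes is 0 when maxmemory is unset, so guard against dividing by zero.
      - alert: RedisMemoryHigh
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9 and redis_memory_max_bytes > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis memory usage above 90% on {{ $labels.instance }}"
```

Without a running exporter, these expressions return no series at all, which is why such rules stay silent rather than firing.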
iven
2026-06-26 10:03:21 +08:00
parent 6457c53d9c
commit ffbe5a797f
4 changed files with 204 additions and 0 deletions

@@ -145,6 +145,7 @@ services:
      - alertmanager_data:/alertmanager
    command:
      - "--config.file=/etc/alertmanager/config.yml"
      - "--config.expand-env=true"
      - "--storage.path=/alertmanager"
    expose:
      - "9093"
@@ -171,6 +172,30 @@ services:
    networks:
      - hms-internal

  # ── Prometheus exporters (PP-04: prometheus.yml already listed these targets, but the services were never deployed, so the alerts never fired) ──
  postgres-exporter:
    image: prometheuscommunity/postgres-exporter:v0.15.0
    container_name: hms-postgres-exporter
    restart: unless-stopped
    environment:
      DATA_SOURCE_NAME: "postgresql://${POSTGRES_USER:-erp}:${POSTGRES_PASSWORD}@postgres:${POSTGRES_PORT:-5432}/${POSTGRES_DB:-erp}?sslmode=disable"
    expose:
      - "9187"
    networks:
      - hms-internal

  redis-exporter:
    image: oliver006/redis_exporter:v1.66.0
    container_name: hms-redis-exporter
    restart: unless-stopped
    environment:
      REDIS_ADDR: "redis://redis:${REDIS_PORT:-6379}"
      REDIS_PASSWORD: "${REDIS_PASSWORD:-erp_redis_dev}"
    expose:
      - "9121"
    networks:
      - hms-internal

volumes:
  app-uploads:
    driver: local
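
The two services added above are only useful if Prometheus scrapes them. The prometheus.yml in this repository is not shown in this diff, so the following is a minimal sketch of what the matching scrape section could look like. It assumes Prometheus is attached to the same hms-internal network, and the job names are illustrative.

```yaml
# Sketch of a matching prometheus.yml fragment (not part of this commit).
# Targets use the compose service names and the ports exposed above;
# job names are assumptions.
scrape_configs:
  - job_name: postgres-exporter
    static_configs:
      - targets: ["postgres-exporter:9187"]
  - job_name: redis-exporter
    static_configs:
      - targets: ["redis-exporter:9121"]
```

Because both exporters use `expose` rather than `ports`, they stay reachable only from other containers on the compose network, which is sufficient for an in-network Prometheus.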