PrometheusAlert + Pushgateway Integration in Practice

Source: Internet  Collected by: 自由互联  Published: 2022-06-20
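Notes

For the basic test environment setup, see "Containerized deployment of Prometheus + Alertmanager + PrometheusAlert + Feishu in practice".

For using Feishu bots, see the "Custom bot guide".

For PrometheusAlert custom templates, see "Custom alert message template usage".

For deploying PrometheusAlert, see "Deploying PrometheusAlert in Kubernetes with MySQL as backend storage". For testing only, sqlite3 can be used directly.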

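PrometheusAlert custom templates

This feature requires a basic understanding of Go's template syntax. The default templates are a useful reference for the syntax when writing your own.

Template data and related information are stored in db/PrometheusAlertDB.db under the program directory.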

Get the Feishu bot webhook URL

Create your own notification bot in Feishu; see "PrometheusAlert Feishu configuration".
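Before wiring the webhook into PrometheusAlert, it can help to confirm that the bot itself accepts messages. The following is a minimal Python sketch, assuming the bot has no keyword or signature restriction enabled and that the standard custom-bot text payload (`msg_type` / `content.text`) applies. The URL is the placeholder used in this article, and the helper names `build_text_message` and `build_request` are made up for illustration.

```python
import json
import urllib.request

# Placeholder webhook URL; substitute the address Feishu shows for your bot.
FEISHU_WEBHOOK = "https://open.feishu.cn/open-apis/bot/v2/hook/1234-xxxx-xxxx-xxxx"


def build_text_message(text: str) -> bytes:
    """Encode a plain-text message in the JSON shape used by Feishu custom bots."""
    payload = {"msg_type": "text", "content": {"text": text}}
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def build_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the bot webhook."""
    return urllib.request.Request(
        webhook_url,
        data=build_text_message(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send the message, uncomment the lines below:
# with urllib.request.urlopen(build_request(FEISHU_WEBHOOK, "hello"), timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```

The request is only built here, not sent, so the sketch can be read and adapted without touching a real bot.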

Verify that PrometheusAlert can send messages

Enable the Feishu alert channel by editing the configuration in the PrometheusAlert YAML file:

# Whether to enable the Feishu alert channel; several channels can be enabled at once. 0 = off, 1 = on
open-feishu=1
# Default Feishu bot URL
fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/1234-xxxx-xxxx-xxxx

# Re-apply the deployment
kubectl delete -f PrometheusAlert-Deployment.yaml
kubectl apply -f PrometheusAlert-Deployment.yaml

Open in a browser: http://your-IP:30036/test

The test message should now appear in Feishu.

Push custom data to Pushgateway

For testing, we define a few simple metrics and push them to Pushgateway so that we can verify the custom template feature.

# Define the data file
# cat pgdata.txt
# TYPE http_request_total counter
# HELP http_request_total get interface request count with different code.
cat_bad{cat_bad="1900",interface="/v1/save"} 1900
cmd_original_dire_num{cmd_original_dire_num="200",interface="/v1/delete"} 200
# TYPE http_request_time gauge
# HELP http_request_time get core interface http request time.
# title
mark{cat_bad="1900",cmd_original_dire_num="200"} 1
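If the data file is generated by a script rather than written by hand, the sample lines can be rendered programmatically. Below is a minimal Python sketch, assuming only the three sample lines from pgdata.txt are needed (the TYPE and HELP comments are left out). The helper names `escape_label_value`, `render_sample` and `render_body` are illustrative and not part of any Prometheus library.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: dict, value) -> str:
    """Render one sample as: name{label="value",...} value"""
    if labels:
        pairs = ",".join(f'{k}="{escape_label_value(str(v))}"' for k, v in labels.items())
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


def render_body(samples) -> str:
    """Join samples into a request body; every line ends with a line feed."""
    return "".join(render_sample(*s) + "\n" for s in samples)


# The same three samples as in pgdata.txt.
SAMPLES = [
    ("cat_bad", {"cat_bad": "1900", "interface": "/v1/save"}, 1900),
    ("cmd_original_dire_num", {"cmd_original_dire_num": "200", "interface": "/v1/delete"}, 200),
    ("mark", {"cat_bad": "1900", "cmd_original_dire_num": "200"}, 1),
]

body = render_body(SAMPLES)
```

Writing `body` to pgdata.txt gives a file that can be pushed with curl. The trailing line feed matters because Pushgateway expects every line of the text format to end with one.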

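For more on using Pushgateway, see "Pushgateway in practice with Prometheus".

Custom template

Open the PrometheusAlert template page in a browser: http://your-IP:30036/template. Click "添加模板" (Add Template), create a template named ai-01-fs, then click "保存模板" (Save Template).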

Update the Alertmanager configuration

Change your Alertmanager configuration so that all alerts are forwarded to the PrometheusAlert custom-template endpoint, for example:

# cat /data/test/alertmanager/etc/alertmanager.yml
global:
  resolve_timeout: 2m
  smtp_from: '319981932@qq.com'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '319981932@qq.com'
  # Note: this must be the QQ Mail authorization code, not the login password; it is shown in the account settings
  smtp_auth_password: 'abcdefghijklmmop'
  smtp_require_tls: false
route:
  #group_by: ['alert_node']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: 'prometheusalert'
receivers:
- name: 'prometheusalert'
  webhook_configs:
  - url: 'http://your-ip:30036/prometheusalert?type=fs&tpl=ai-01-fs&fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/1234-xxxx-xxxx-xxxx'

# Restart the alertmanager service
docker restart alertmanager
# View the alertmanager service logs
docker logs alertmanager
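The webhook URL above carries three query parameters: `type` (the channel, `fs` for Feishu), `tpl` (the template name) and `fsurl` (the bot webhook). A small Python sketch of composing it is shown below. The function name `prometheusalert_url` is hypothetical, and percent-encoding `fsurl` is a defensive choice made here; the article passes it unencoded, which works for this webhook because it contains no `&` or `#`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def prometheusalert_url(base: str, tpl: str, fsurl: str, channel: str = "fs") -> str:
    """Compose the /prometheusalert endpoint with type, tpl and fsurl parameters."""
    query = urlencode({"type": channel, "tpl": tpl, "fsurl": fsurl})
    return f"{base.rstrip('/')}/prometheusalert?{query}"


url = prometheusalert_url(
    "http://your-ip:30036",
    "ai-01-fs",
    "https://open.feishu.cn/open-apis/bot/v2/hook/1234-xxxx-xxxx-xxxx",
)
print(url)
```

Decoding the query string again with `parse_qs(urlsplit(url).query)` returns the original three values, which is a quick way to check a URL before pasting it into alertmanager.yml.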

Update the Prometheus rule file and add a Pushgateway alert rule:

# cat /data/test/prometheus/rules/alert-node-rules.yml
# Add the following:
- alert: test
  expr: mark{ job="pushgateway"} == 1
  for: 1m
  labels:
    test: mv
  annotations:
    description: "this is a test02"

Restart the Prometheus container:

docker restart prometheus

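Verify the custom-data alert

With the alerting prerequisites in place, the end-to-end flow is as follows. The client POSTs data to Pushgateway. Prometheus scrapes the data from Pushgateway, evaluates the alert rule, and sends the firing alert to Alertmanager. Alertmanager forwards it to PrometheusAlert, which renders it with the custom ai-01-fs template. Finally, the Feishu bot delivers the alert message.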

Send data from the client

# The job_name defined here is 'pushgateway'
curl -XPOST --data-binary @pgdata.txt http://your-IP:30037/metrics/job/pushgateway

Check Pushgateway

Open in a browser: http://your-IP:30037/

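To delete these metrics, click "Delete Group" in the upper-right corner of the page, or use the command line:

# Delete all metrics in this job's group:
curl -X DELETE http://your-ip:30037/metrics/job/pushgateway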
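The push and delete calls can also be issued from a script instead of curl. The sketch below uses only Python's standard library and builds the two requests against the same `/metrics/job/<job>` path. The address is the article's placeholder, and `group_url`, `push_request` and `delete_request` are illustrative names. No Content-Type header is set, so urllib falls back to its default for POST data, as curl does with `--data-binary`.

```python
import urllib.request

PUSHGATEWAY = "http://your-IP:30037"  # placeholder address, as in the article


def group_url(base: str, job: str) -> str:
    """URL of the metric group identified by the job label alone."""
    return f"{base.rstrip('/')}/metrics/job/{job}"


def push_request(base: str, job: str, body: str) -> urllib.request.Request:
    """POST request carrying metrics in the text format (not sent here)."""
    return urllib.request.Request(group_url(base, job), data=body.encode("utf-8"), method="POST")


def delete_request(base: str, job: str) -> urllib.request.Request:
    """DELETE request that removes every metric in the job's group (not sent here)."""
    return urllib.request.Request(group_url(base, job), method="DELETE")


# To send either request against a real Pushgateway:
# urllib.request.urlopen(push_request(PUSHGATEWAY, "pushgateway", open("pgdata.txt").read()))
# urllib.request.urlopen(delete_request(PUSHGATEWAY, "pushgateway"))
```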

Check Prometheus

Open in a browser: http://your-IP:9090/

Check Alertmanager

Open in a browser: http://your-IP:9093/#/alerts

Check the Feishu alert message

The alert rendered with the ai-01-fs template should now arrive in the Feishu group.

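About custom template variables

The PrometheusAlert documentation covers this as well; here is a brief recap.

Look at the messages PrometheusAlert logs as it receives them. While POSTing data from the client, follow the PrometheusAlert logs: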

kubectl logs prometheus-alert-center-6966747c-b48f7 -n kube-mon -f

This yields data like the following:

2021/11/05 03:35:34.364 [D] [value.go:476] [1636083334364061744] {"receiver":"prometheusalert","status":"resolved","alerts":[{"status":"resolved","labels":{"alertname":"test","cat_bad":"1900","cmd_original_dire_num":"200","exported_job":"pushgateway","instance":"your-IP:30037","job":"pushgateway","test":"mv"},"annotations":{"description":"this is a test02"},"startsAt":"2021-11-05T03:34:39.341Z","endsAt":"2021-11-05T03:35:24.341Z","generatorURL":"http://xxxxx:9090/graph?g0.expr=mark%7Bjob%3D%22pushgateway%22%7D+%3D%3D+1\u0026g0.tab=1","fingerprint":"xxxxxxx"}],"groupLabels":{},"commonLabels":{"alertname":"test","cat_bad":"1900","cmd_original_dire_num":"200","exported_job":"pushgateway","instance":"yourIP:30037","job":"pushgateway","test":"mv"},"commonAnnotations":{"description":"this is a test02"},"externalURL":"http://xxxxx:9093","version":"4","groupKey":"{}:{}","truncatedAlerts":0}

Extract the JSON from the log line and format it with any JSON formatter:

{
  "receiver": "prometheusalert",
  "status": "resolved",
  "alerts": [
    {
      "status": "resolved",
      "labels": {
        "alertname": "test",
        "cat_bad": "1900",
        "cmd_original_dire_num": "200",
        "exported_job": "pushgateway",
        "instance": "your-IP:30037",
        "job": "pushgateway",
        "test": "mv"
      },
      "annotations": {
        "description": "this is a test02"
      },
      "startsAt": "2021-11-05T03:34:39.341Z",
      "endsAt": "2021-11-05T03:35:24.341Z",
      "generatorURL": "http://xxxxx:9090/graph?g0.expr=mark%7Bjob%3D%22pushgateway%22%7D+%3D%3D+1\u0026g0.tab=1",
      "fingerprint": "xxxxxxx"
    }
  ],
  "groupLabels": {},
  "commonLabels": {
    "alertname": "test",
    "cat_bad": "1900",
    "cmd_original_dire_num": "200",
    "exported_job": "pushgateway",
    "instance": "yourIP:30037",
    "job": "pushgateway",
    "test": "mv"
  },
  "commonAnnotations": {
    "description": "this is a test02"
  },
  "externalURL": "http://xxxxx:9093",
  "version": "4",
  "groupKey": "{}:{}",
  "truncatedAlerts": 0
}
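The extraction and formatting step can be scripted. Here is a minimal Python sketch, assuming the log prefix (timestamp, level, source location, id) never contains a `{`, as in the sample line above. The embedded `sample` is a shortened version of that log line, and `extract_payload` is an illustrative helper name.

```python
import json


def extract_payload(log_line: str) -> dict:
    """Return the JSON object that follows the log prefix on one line."""
    start = log_line.index("{")  # assumes the prefix itself contains no brace
    return json.loads(log_line[start:])


# Shortened version of the PrometheusAlert log line shown above.
sample = (
    '2021/11/05 03:35:34.364 [D] [value.go:476] [1636083334364061744] '
    '{"receiver":"prometheusalert","status":"resolved",'
    '"alerts":[{"labels":{"cat_bad":"1900","cmd_original_dire_num":"200"}}],'
    '"groupKey":"{}:{}"}'
)

payload = extract_payload(sample)
print(json.dumps(payload, indent=2, ensure_ascii=False))
```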

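Then write the template against this JSON and add it on the Dashboard. An example template:

ES_1_标准英语 data classification comparison finished! JSON classification = see the log for details
{{ range $k,$v:=.alerts }}
###### JSON classification: {{$v.labels.cat_bad}}
###### Number of original directories: {{$v.labels.cmd_original_dire_num}}
{{ end }}

The JSON above can also be pasted into the "消息协议JSON内容" (message protocol JSON content) field on the Dashboard to simulate a test. After adding the custom template, be sure to click Save.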
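To see which fields the template's `range` over `.alerts` picks up, the same traversal can be mirrored outside PrometheusAlert. The Python sketch below is only an illustration of the lookup path (`alerts[i].labels.cat_bad` and `alerts[i].labels.cmd_original_dire_num`), not how PrometheusAlert renders templates. Substituting an empty string for a missing label is a choice made in this sketch, and `render_alerts` is a made-up name.

```python
def render_alerts(payload: dict) -> str:
    """Mimic the example template: two lines per alert, read from its labels."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(f"JSON classification: {labels.get('cat_bad', '')}")
        lines.append(f"Number of original directories: {labels.get('cmd_original_dire_num', '')}")
    return "\n".join(lines)


# Trimmed-down payload with the same label values as the logged alert.
example_payload = {
    "alerts": [
        {"labels": {"alertname": "test", "cat_bad": "1900", "cmd_original_dire_num": "200"}}
    ]
}

print(render_alerts(example_payload))
```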

Label-based routing with Alertmanager and PrometheusAlert templates

We create two bots and add them to different group chats.

Update the Prometheus alert rules

# cat /data/test/prometheus/rules/alert-node-rules.yml
- alert: test01
  expr: mark{ job="pushgateway"} == 1
  for: 1m
  labels:
    test: ceshi-01
  annotations:
    description: "this is a test01"
- alert: test02
  expr: ttmark{ job="pushgateway"} == 1
  for: 1m
  labels:
    test: ceshi-02
  annotations:
    description: "this is a test02"

# Restart the prometheus service
docker restart prometheus
# View the prometheus service logs
docker logs prometheus

Update the Alertmanager routes

global:
  resolve_timeout: 2m
  smtp_from: '319981932@qq.com'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '319981932@qq.com'
  # Note: this must be the QQ Mail authorization code, not the login password; it is shown in the account settings
  smtp_auth_password: 'abcdefghijklmmop'
  smtp_require_tls: false
route:
  #group_by: ['alert_node']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: 'default'
  routes:
  - matchers:
    - cmd_original_dire_num="200"
    receiver: 'prometheusalert-02'
  - matchers:
    - cmd_original_dire_num="300"
    receiver: 'prometheusalert-01'
receivers:
- name: 'default'
  wechat_configs:
  - corp_id: 'your-id'
    to_user: '1234567'
    agent_id: your-id # note: must be an int
    api_secret: 'your-secret'
    send_resolved: true
- name: 'prometheusalert-01'
  webhook_configs:
  - url: 'http://your-IP:30036/prometheusalert?type=fs&tpl=ai-01-fs&fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/xxxx-xxxx'
- name: 'prometheusalert-02'
  webhook_configs:
  - url: 'http://your-IP:30036/prometheusalert?type=fs&tpl=ai-01-fs&fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/yyyy-yyyy'

# Restart the alertmanager service
docker restart alertmanager
# View the alertmanager service logs
docker logs alertmanager
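The routing above can be reasoned about with a simplified model: child routes are checked in order, the first one whose matchers all equal the alert's labels wins, and anything unmatched falls back to the top-level receiver. The Python sketch below models only that much (equality matchers, no `continue`, no nesting, no grouping). It is a reading aid under those assumptions rather than Alertmanager's actual implementation, and `pick_receiver` is a made-up name.

```python
# Child routes from the configuration above, in declaration order.
ROUTES = [
    ({"cmd_original_dire_num": "200"}, "prometheusalert-02"),
    ({"cmd_original_dire_num": "300"}, "prometheusalert-01"),
]
DEFAULT_RECEIVER = "default"


def pick_receiver(labels: dict) -> str:
    """Return the receiver of the first route whose matchers all equal the labels."""
    for matchers, receiver in ROUTES:
        if all(labels.get(name) == value for name, value in matchers.items()):
            return receiver
    return DEFAULT_RECEIVER


# An alert carrying cmd_original_dire_num="200" goes to the second bot,
# one carrying "300" goes to the first bot, and anything else goes to WeChat.
print(pick_receiver({"alertname": "test01", "cmd_original_dire_num": "200"}))
print(pick_receiver({"alertname": "test02", "cmd_original_dire_num": "300"}))
print(pick_receiver({"alertname": "other"}))
```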

Send data from the client

# cat pgdata.txt
# TYPE http_request_total counter
# HELP http_request_total get interface request count with different code.
cat_bad{cat_bad="1900",interface="/v1/save"} 1900
cmd_original_dire_num{cmd_original_dire_num="200",interface="/v1/delete"} 200
# TYPE http_request_time gauge
# HELP http_request_time get core interface http request time.
# title
mark{cat_bad="1900",cmd_original_dire_num="200"} 1

# cat test.txt
# TYPE http_request_total counter
# HELP http_request_total get interface request count with different code.
cat_bad{cat_bad="1800",interface="/v1/save"} 1900
cmd_original_dire_num{cmd_original_dire_num="100",interface="/v1/delete"} 200
# TYPE http_request_time gauge
# HELP http_request_time get core interface http request time.
# title
ttmark{cat_bad="1800",cmd_original_dire_num="300"} 1

# post data
curl -XPOST --data-binary @pgdata.txt http://your-IP:30037/metrics/job/pushgateway
curl -XPOST --data-binary @test.txt http://your-IP:30037/metrics/job/pushgateway

Verification

Open Pushgateway in a browser: http://your-IP:30037/

Feishu alert messages: each of the two bots should post the alert routed to it in its own group chat.
