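Prometheus service discovery with Consul
Motivation
Prometheus supports many service discovery mechanisms. I happen to be deploying a Prometheus project right now, a setup built more or less from scratch. I normally use the operator version inside Kubernetes, but this new setup is deployed outside Kubernetes, so today's post is a short record of using Prometheus together with Consul.
Deploying Consul
In the development environment we use a single-node deployment; for details see "Consul single-node deployment".
Deploying Prometheus
For details see "Prometheus deployment".
Service registration
Once Consul is deployed, we can register services with it.
Register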
curl -X PUT -d '{"name": "node-exporter","id": "node-exporter-192.168.2.201","address": "192.168.2.201","port": 59100,"tags": ["node-exporter"],"Meta": {"env": "dev"},"checks": [{"http": "http://192.168.2.201:59100/metrics", "interval": "5s"}]}' http://your_consul_ip:8500/v1/agent/service/register
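To avoid editing the JSON by hand for each host, the call above can be wrapped in a small shell helper. This is only a sketch: the function names `build_payload` and `register_node_exporter` and the `CONSUL_ADDR` variable are made up for illustration, and the payload simply mirrors the fields used in the curl command above.

```shell
# Hypothetical helper: print the registration payload for one node-exporter host.
# Arguments: IP address, port, value for the "env" metadata key.
build_payload() {
  local ip="$1" port="$2" env="$3"
  printf '{"name": "node-exporter", "id": "node-exporter-%s", "address": "%s", "port": %s, "tags": ["node-exporter"], "Meta": {"env": "%s"}, "checks": [{"http": "http://%s:%s/metrics", "interval": "5s"}]}\n' \
    "$ip" "$ip" "$port" "$env" "$ip" "$port"
}

# Hypothetical helper: send the payload to the Consul agent.
# CONSUL_ADDR is an assumed variable; it defaults to the placeholder used above.
register_node_exporter() {
  local consul_addr="${CONSUL_ADDR:-your_consul_ip:8500}"
  # -d @- makes curl read the request body from stdin.
  build_payload "$@" |
    curl -fsS -X PUT -d @- "http://${consul_addr}/v1/agent/service/register"
}
```

Usage would look like `CONSUL_ADDR=192.168.2.201:8500 register_node_exporter 192.168.2.201 59100 dev`. Running `build_payload` on its own prints the JSON so it can be inspected before anything is sent.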
Deregister
curl -XPUT http://your_consul_ip:8500/v1/agent/service/deregister/node-exporter-192.168.2.201
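Since the service ID above follows the pattern `node-exporter-<ip>`, deregistration can be derived from the IP alone. The sketch below assumes that naming convention; `service_id`, `deregister_node_exporter`, `list_agent_services`, and `CONSUL_ADDR` are illustrative names, not part of Consul. The last function calls the agent's `/v1/agent/services` endpoint, which lists the services registered on that agent and is a quick way to confirm a registration or removal.

```shell
# Hypothetical helper: derive the service ID from the host IP,
# following the "node-exporter-<ip>" convention used at registration time.
service_id() {
  printf 'node-exporter-%s\n' "$1"
}

# Hypothetical helper: deregister the node-exporter service for one host.
deregister_node_exporter() {
  local consul_addr="${CONSUL_ADDR:-your_consul_ip:8500}"
  curl -fsS -X PUT \
    "http://${consul_addr}/v1/agent/service/deregister/$(service_id "$1")"
}

# List the services known to this Consul agent.
list_agent_services() {
  local consul_addr="${CONSUL_ADDR:-your_consul_ip:8500}"
  curl -fsS "http://${consul_addr}/v1/agent/services"
}
```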
Service discovery configuration
Prometheus then pulls the services it needs to monitor from Consul. We use node-exporter as the example.
prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

rule_files:
  - rules/*.yml

scrape_configs:
  # ... other jobs
  - job_name: "consul-node-exporter"
    consul_sd_configs:
      - server: '192.168.2.201:8500'
        services: []
    relabel_configs:
      # Keep only targets whose Consul tags contain "node-exporter".
      - source_labels: [__meta_consul_tags]
        regex: .*node-exporter.*
        action: keep
      # Copy the Consul service ID into the "node" label.
      - source_labels: ['__meta_consul_service_id']
        regex: '(.*)'
        target_label: 'node'
        replacement: '$1'
      # Use the service address as the "instance" label.
      - source_labels: ['__meta_consul_service_address']
        regex: '(.*)'
        target_label: 'instance'
        replacement: '$1'
      # Expose the "env" key from the service Meta as the "env" label.
      - source_labels: ['__meta_consul_service_metadata_env']
        regex: '(.*)'
        target_label: 'env'
        replacement: '$1'
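One detail of the `keep` rule is worth illustrating. As far as I know, Prometheus joins a service's Consul tags with `,` and also wraps the joined list in `,`, so the tags `[prod, node-exporter]` reach the rule as `,prod,node-exporter,`. Relabel regexes are anchored at both ends. The shell sketch below imitates that behaviour with `grep -E`; it is a simulation under those assumptions, not Prometheus code, and the function names are made up.

```shell
# Simulate the value of __meta_consul_tags for a list of tags.
# Assumption: tags are joined with "," and the list is wrapped in "," as well.
join_tags() {
  local joined="," tag
  for tag in "$@"; do
    joined="${joined}${tag},"
  done
  printf '%s\n' "$joined"
}

# The regex from the config above; ^...$ imitates Prometheus' anchoring.
matches_loose() {
  printf '%s\n' "$1" | grep -Eq '^.*node-exporter.*$'
}

# A stricter variant that only matches the exact tag "node-exporter".
matches_exact() {
  printf '%s\n' "$1" | grep -Eq '^.*,node-exporter,.*$'
}

# Example: a tag that merely contains the substring.
tags="$(join_tags prod node-exporter-test)"
if matches_loose "$tags"; then echo "loose regex keeps:  $tags"; fi
if ! matches_exact "$tags"; then echo "exact regex drops:  $tags"; fi
```

With the config as written, a service tagged `node-exporter-test` would also be kept. If only the exact tag should match, a regex of the form `.*,node-exporter,.*` is the usual choice.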
Result (screenshot)
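Summary
Overall, the deployment is not difficult. Registering further machines and services should be handled with ansible or other automation scripts; doing it by hand, one host at a time, would be exhausting.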
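As a starting point for such a script, here is a minimal shell sketch that registers every host listed in a file. Everything about it is an assumption made for illustration: the inventory format (one `ip port env` triple per line), the `register_all` function name, and the `CONSUL_ADDR` and `DRY_RUN` variables. The payload matches the one used in the registration command earlier.

```shell
# Hypothetical batch registration.
# Inventory format (assumed): one "ip port env" triple per line;
# blank lines and lines starting with "#" are skipped.
# With DRY_RUN=1 the payloads are printed instead of being sent.
register_all() {
  local inventory="$1"
  local consul_addr="${CONSUL_ADDR:-your_consul_ip:8500}"
  local ip port env payload
  while read -r ip port env; do
    case "$ip" in
      ''|'#'*) continue ;;
    esac
    payload=$(printf '{"name": "node-exporter", "id": "node-exporter-%s", "address": "%s", "port": %s, "tags": ["node-exporter"], "Meta": {"env": "%s"}, "checks": [{"http": "http://%s:%s/metrics", "interval": "5s"}]}' \
      "$ip" "$ip" "$port" "$env" "$ip" "$port")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf '%s\n' "$payload"
    else
      curl -fsS -X PUT -d "$payload" \
        "http://${consul_addr}/v1/agent/service/register"
    fi
  done < "$inventory"
}
```

A dry run such as `DRY_RUN=1 register_all hosts.txt` prints one JSON document per host, which makes it easy to review the payloads before pointing the script at a real Consul agent.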