Prometheus Service Discovery with Consul

I. Environment

| Hostname | IP Address | OS | Notes |
| --- | --- | --- | --- |
| localhost | 192.168.224.11 | CentOS 7.6 | Prometheus installed via Docker |
| server2.com | 192.168.224.12 | CentOS 7.6 | |

II. Consul-based Service Discovery

Consul, developed by HashiCorp, is open-source software providing distributed service discovery and key-value storage with multi-datacenter support. It is a general-purpose service discovery and registration tool, widely used in microservice-based architectures.

We will register exporter services with Consul through its HTTP API, then configure Prometheus to discover the instances from Consul. For more on using Consul itself, see the official documentation at https://learn.hashicorp.com/consul.

1. Install Consul from the binary release (option 1 of 2)

Download the package for your platform from https://www.consul.io/downloads. For a Linux system, as used here, download and install it with:

wget https://releases.hashicorp.com/consul/1.14.5/consul_1.14.5_linux_amd64.zip

yum install unzip -y

unzip consul_1.14.5_linux_amd64.zip

mv consul /usr/local/bin
consul version

Start Consul

To get more log output, we can run Consul in dev mode:

consul agent -dev -client 0.0.0.0

The -client flag sets the address that Consul's client interfaces (HTTP API and UI) bind to; the default is 127.0.0.1.
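
With the agent running, you can confirm the HTTP API is reachable; a minimal check against the default port 8500:

curl -s http://127.0.0.1:8500/v1/status/leader

This returns the address of the current Raft leader (in dev mode, the agent itself); an empty reply means no leader has been elected yet.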

2. Install Consul with Docker (option 2 of 2)

Run the container:

docker run -d --name consul -p 8500:8500 consul:1.14.5

Verify it is running:

docker ps

3. Consul web UI address

http://192.168.224.11:8500/ui/dc1/services

4. Register services through the API

Register from the command line:
curl -X PUT -d '{"id": "node1","name": "node_exporter","address": "node_exporter","port": 9100,"tags": ["exporter"],"meta": {"job": "node_exporter","instance": "Prometheus服务器"},"checks": [{"http": "http://192.168.224.11:9100/metrics", "interval": "5s"}]}'  http://localhost:8500/v1/agent/service/register
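
To confirm the registration took, list the services known to the local agent (python3 -m json.tool is used here only to pretty-print the response; any JSON formatter works):

curl -s http://localhost:8500/v1/agent/services | python3 -m json.tool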

Alternatively, put the JSON payload into a file and register from that file:

mkdir /data/consul
cd /data/consul

cat > node_exporter.json <<"EOF"
{
  "id": "node2",
  "name": "node_exporter",
  "address": "192.168.224.12",
  "port": 9100,
  "tags": ["exporter"],
  "meta": {
    "job": "node_exporter",
    "instance": "server2.com服务器"
  },
  "checks": [{
    "http": "http://192.168.224.12:9100/metrics",
    "interval": "10s"
  }]
}
EOF

Register using the JSON file:
curl --request PUT --data @node_exporter.json http://localhost:8500/v1/agent/service/register

Besides the two demo services we registered, the Consul agent also registers itself as a service named consul. Open http://192.168.224.11:8500 in a browser to view the registered services.

(screenshot: Consul UI service list)

In the Consul UI you should see two services: consul and node_exporter.
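
You can also verify from the command line with Consul's health API; this lists only the node_exporter instances whose health checks are passing (assuming the agent is reachable at localhost:8500):

curl -s 'http://localhost:8500/v1/health/service/node_exporter?passing'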

5. Configure Prometheus

We registered two node_exporter services with Consul above; next we configure Prometheus to discover the node_exporter services automatically through Consul.

Add the scrape configuration shown below to the scrape_configs section of the Prometheus configuration file prometheus.yml.

Back up the original file first:

cd /data/docker-prometheus
cp -a prometheus/prometheus.yml prometheus/prometheus.yml.bak
ls -l prometheus/prometheus.yml.bak

Use cat to replace the previous configuration with the one below:

cat > prometheus/prometheus.yml <<"EOF"
# Global configuration
global:
  scrape_interval: 15s     # Scrape targets every 15 seconds; the default is 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds; the default is 1 minute.

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

# Alerting (trigger) rules
rule_files:
  - "alert.yml"
  - "rules/*.yml"

# Scrape configurations
scrape_configs:
  - job_name: 'prometheus'
    # Override the global default and scrape this job's targets every 15 seconds
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'alertmanager'
    # Override the global default and scrape this job's targets every 15 seconds
    scrape_interval: 15s
    static_configs:
      - targets: ['alertmanager:9093']

  - job_name: 'consul_exporter'
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*exporter.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
  # Spring Boot 2.x application metrics
  - job_name: 'consul_springboot_demo'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*springboot.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
  # HTTP probes
  - job_name: "consul-blackbox_http"
    metrics_path: /probe
    params:
      module: [http_2xx]
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_http.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.224.12:9115
  # TCP probes
  - job_name: "consul_blackbox_tcp"
    metrics_path: /probe
    params:
      module: [tcp_connect]
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_tcp.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.224.12:9115

  # ICMP probes
  - job_name: "consul_blackbox_icmp"
    metrics_path: /probe
    params:
      module: [icmp]
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*blackbox_icmp.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.224.12:9115

  # Domain checks
  - job_name: consul_domain_exporter
    scrape_interval: 10s
    metrics_path: /probe
    consul_sd_configs:
      - server: '192.168.224.11:8500'
        services: []
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*domain.*
        action: keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
      - source_labels: [__meta_consul_service_address]
        target_label: __param_target
      - target_label: __address__
        replacement: 192.168.224.12:9222
EOF

consul_sd_configs points Prometheus at the Consul server used for discovery; services: [] (an empty list) means all registered services are discovered. The keep rules in relabel_configs then filter the targets down to only the services whose Consul tags match the given regex, and the labelmap rule copies each service's meta fields into Prometheus labels: for example, a service registered with "meta": {"job": "node_exporter", ...} exposes __meta_consul_service_metadata_job during relabeling, which labelmap turns into a job label on the target.
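
Before reloading, it is worth validating the new file with promtool; a sketch assuming the stock prom/prometheus image with the container named prometheus and the config mounted at /etc/prometheus/prometheus.yml (adjust both to your setup):

# promtool ships inside the Prometheus image; this checks the config syntax
docker exec prometheus promtool check config /etc/prometheus/prometheus.yml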

After the configuration change, reload Prometheus (the /-/reload endpoint only works when Prometheus runs with --web.enable-lifecycle; restarting the container also works):

curl -X POST http://localhost:9090/-/reload

Then open the Targets page in the Prometheus web UI to verify the new scrape jobs are present:

http://192.168.224.11:9090/targets

(screenshot: Prometheus Targets page)

If everything is working, you should see the consul_exporter job with two automatically discovered scrape targets under it.
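
To double-check from the command line, the Prometheus HTTP API can list the active targets (jq is assumed to be installed here and is used only for readability):

# Print the job and instance labels of every active scrape target
curl -s http://192.168.224.11:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, instance: .labels.instance}'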

6. Create a bulk-registration script

Use a prepared script to register multiple targets at once:

cat >/data/consul/api.sh <<"EOF"
#nginx
curl -X PUT -d '{"id": "nginx1","name": "nginx_exporter","address": "192.168.224.12","port": 9113,"tags": ["exporter"],"meta": {"job": "nginx_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9113/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#rabbitmq
curl -X PUT -d '{"id": "rabbitmq1","name": "rabbitmq_exporter","address": "192.168.224.12","port": 9419,"tags": ["exporter"],"meta": {"job": "rabbitmq_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9419/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#redis
curl -X PUT -d '{"id": "redis1","name": "redis_exporter","address": "192.168.224.12","port": 9121,"tags": ["exporter"],"meta": {"job": "redis_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9121/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#mongodb
curl -X PUT -d '{"id": "mongodb1","name": "mongodb_exporter","address": "192.168.224.12","port": 9216,"tags": ["exporter"],"meta": {"job": "mongodb_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9216/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#mysql
curl -X PUT -d '{"id": "mysql1","name": "mysqld_exporter","address": "192.168.224.12","port": 9104,"tags": ["exporter"],"meta": {"job": "mysqld_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9104/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#cadvisor
curl -X PUT -d '{"id": "cadvisor1","name": "cadvisor","address": "cadvisor","port": 8080,"tags": ["exporter"],"meta": {"job": "cadvisor","instance": "Prometheus服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.11:8080/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register
curl -X PUT -d '{"id": "cadvisor2","name": "cadvisor","address": "192.168.224.12","port": 8080,"tags": ["exporter"],"meta": {"job": "cadvisor","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:8080/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#springboot
curl -X PUT -d '{"id": "springboot1","name": "springboot","address": "192.168.224.12","port": 8081,"tags": ["springboot"],"meta": {"job": "springboot","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:8081/actuator/prometheus", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register


#process_exporter
curl -X PUT -d '{"id": "process1","name": "process_exporter","address": "192.168.224.12","port": 9256,"tags": ["exporter"],"meta": {"job": "process_exporter","instance": "server2.com服务器","env":"server2.com"},"checks": [{"http": "http://192.168.224.12:9256/metrics", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#http
curl -X PUT -d '{"id": "http1","name": "blackbox_http","address": "https://www.jd.com","tags": ["blackbox_http"],"checks": [{"http": "http://192.168.224.12:9115", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#tcp
curl -X PUT -d '{"id": "tcp1","name": "blackbox_tcp","address": "192.168.224.11:9090","tags": ["blackbox_tcp"],"checks": [{"http": "http://192.168.224.12:9115", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register

#icmp
curl -X PUT -d '{"id": "icmp1","name": "blackbox_icmp","address": "192.168.224.12","tags": ["blackbox_icmp"],"checks": [{"http": "http://192.168.224.12:9115", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register


#domain
curl -X PUT -d '{"id": "domain1","name": "domain_exporter","address": "baidu.com","tags": ["domain"],"checks": [{"http": "http://192.168.224.12:9222", "interval": "5s"}]}' http://localhost:8500/v1/agent/service/register
EOF

Run the script:
sh /data/consul/api.sh

Verify:
http://192.168.224.11:9090/targets

7. Deregister a service from Consul

Deregister by the service ID (the id field used at registration time), replacing ID below with the actual value:
curl --request PUT http://127.0.0.1:8500/v1/agent/service/deregister/ID
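
For example, to remove the node2 instance registered earlier:

curl --request PUT http://127.0.0.1:8500/v1/agent/service/deregister/node2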

8. Troubleshooting

A Consul health check fails, as shown below:

(screenshot: failing health check in the Consul UI)

Cause:

Port 8080 was not published from the container (see the screenshot below), so Consul's health check could not reach it and the check failed.

(screenshot: docker-compose.yaml without the 8080 port mapping)

Fix:

Edit docker-compose.yaml to publish port 8080, as shown below:

(screenshot: docker-compose.yaml with the 8080 port mapping added)
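
For reference, a minimal sketch of the relevant docker-compose.yaml fragment (the service name cadvisor is an assumption; apply it to whichever service the failing check targets):

services:
  cadvisor:            # hypothetical service name; use the one whose check is failing
    ports:
      - "8080:8080"    # publish the port so Consul's HTTP check can reach it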

After the change, apply it with:
docker-compose up -d
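
Once it is back up, confirm the container is running and the port mapping took effect:

docker-compose ps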

III. ConsulManager

Official Gitee repository

1. ConsulManager depends on Consul, so deploy Consul first. (The highest supported version is currently Consul v1.14.5.) See docs/Consul部署说明.md.

2. Deploy ConsulManager with docker-compose

  • Download: wget https://starsl.cn/static/img/docker-compose.yml (this is the docker-compose.yml from the repository root)

  • Edit:

    vim docker-compose.yml

    Set three environment variables:

    **consul_token**: the Consul login token (see the project docs for how to obtain one); you can also skip the token, in which case Consul is used without a password (insecure).

    **consul_url**: the Consul URL (it starts with http, and the /v1 suffix must be kept)

    **admin_passwd**: the admin password for logging in to the ConsulManager web UI

  • Start: docker-compose pull && docker-compose up -d

  • Access: http://192.168.224.11:1026/ and log in with the admin_passwd you configured

  • If you hit problems during installation or use, see the FAQ
