[Cloud Native • Prometheus] How Prometheus Service Discovery Works with the Eureka Registry

Overview

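The Eureka service discovery protocol lets Prometheus retrieve the targets it should monitor through the Eureka REST API. Prometheus calls the Eureka REST API periodically and creates one target for each application instance.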

The Eureka service discovery protocol makes the following meta labels available for relabeling:

  • __meta_eureka_app_name: the name of the app
  • __meta_eureka_app_instance_id: the ID of the app instance
  • __meta_eureka_app_instance_hostname: the hostname of the instance
  • __meta_eureka_app_instance_homepage_url: the homepage url of the app instance
  • __meta_eureka_app_instance_statuspage_url: the status page url of the app instance
  • __meta_eureka_app_instance_healthcheck_url: the health check url of the app instance
  • __meta_eureka_app_instance_ip_addr: the IP address of the app instance
  • __meta_eureka_app_instance_vip_address: the VIP address of the app instance
  • __meta_eureka_app_instance_secure_vip_address: the secure VIP address of the app instance
  • __meta_eureka_app_instance_status: the status of the app instance
  • __meta_eureka_app_instance_port: the port of the app instance
  • __meta_eureka_app_instance_port_enabled: whether the port of the app instance is enabled
  • __meta_eureka_app_instance_secure_port: the secure port of the app instance
  • __meta_eureka_app_instance_secure_port_enabled: whether the secure port of the app instance is enabled
  • __meta_eureka_app_instance_country_id: the country ID of the app instance
  • __meta_eureka_app_instance_metadata_<metadataname>: app instance metadata
  • __meta_eureka_app_instance_datacenterinfo_name: the datacenter name of the app instance
  • __meta_eureka_app_instance_datacenterinfo_<metadataname>: the datacenter metadata

A typical eureka_sd_configs configuration looks like this:

- job_name: 'eureka'
  eureka_sd_configs:
    - server: http://localhost:8761/eureka # Eureka server address
      refresh_interval: 1m # refresh interval (default: 30s)

The official documentation lists the following main options for eureka_sd_configs:

server: <string>

basic_auth:
  [ username: <string> ]
  [ password: <secret> ]
  [ password_file: <string> ]

# Configures the scrape request's TLS settings.
tls_config:
  [ <tls_config> ]

# Optional proxy URL.
[ proxy_url: <string> ]

# Configure whether HTTP requests follow HTTP 3xx redirects.
[ follow_redirects: <bool> | default = true ]

# Refresh interval to re-read the app instance list.
[ refresh_interval: <duration> | default = 30s ]

Eureka Protocol Implementation

The core logic of Eureka-based service discovery is encapsulated in the func (d *Discovery) refresh(ctx context.Context) ([]*targetgroup.Group, error) method in discovery/eureka.go:

func (d *Discovery) refresh(ctx context.Context) ([]*targetgroup.Group, error) {
    // Pull the metadata from Eureka through its REST API: http://ip:port/eureka/apps
    apps, err := fetchApps(ctx, d.server, d.client)
    if err != nil {
        return nil, err
    }

    tg := &targetgroup.Group{
        Source: "eureka",
    }

    for _, app := range apps.Applications { // iterate over the apps
        // targetsForApp() converts every instance of the app into a target
        targets := targetsForApp(&app)
        // merge the parsed targets into a single group
        tg.Targets = append(tg.Targets, targets...)
    }
    return []*targetgroup.Group{tg}, nil
}

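The refresh method has two main steps:

1. fetchApps(): pulls the registered service information from the Eureka server's /eureka/apps endpoint;

2. targetsForApp(): iterates over the instances of each app, parses each instance into a target, and attaches a set of meta labels to it.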

The following example shows registered service information pulled from the Eureka server's /eureka/apps endpoint:

<applications>
  <versions__delta>1</versions__delta>
  <apps__hashcode>UP_1_</apps__hashcode>
  <application>
    <name>SERVICE-PROVIDER-01</name>
    <instance>
      <instanceId>localhost:service-provider-01:8001</instanceId>
      <hostName>192.168.3.121</hostName>
      <app>SERVICE-PROVIDER-01</app>
      <ipAddr>192.168.3.121</ipAddr>
      <status>UP</status>
      <overriddenstatus>UNKNOWN</overriddenstatus>
      <port enabled="true">8001</port>
      <securePort enabled="false">443</securePort>
      <countryId>1</countryId>
      <dataCenterInfo class="com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo">
        <name>MyOwn</name>
      </dataCenterInfo>
      <leaseInfo>
        <renewalIntervalInSecs>30</renewalIntervalInSecs>
        <durationInSecs>90</durationInSecs>
        <registrationTimestamp>1629385562130</registrationTimestamp>
        <lastRenewalTimestamp>1629385682050</lastRenewalTimestamp>
        <evictionTimestamp>0</evictionTimestamp>
        <serviceUpTimestamp>1629385562132</serviceUpTimestamp>
      </leaseInfo>
      <metadata>
        <management.port>8001</management.port>
        <scrape__enable>true</scrape__enable>
        <scrape.port>8080</scrape.port>
      </metadata>
      <homePageUrl>http://192.168.3.121/</homePageUrl>
      <statusPageUrl>http://192.168.3.121/actuator/info</statusPageUrl>
      <healthCheckUrl>http://192.168.3.121/actuator/health</healthCheckUrl>
      <vipAddress>service-provider-01</vipAddress>
      <secureVipAddress>service-provider-01</secureVipAddress>
      <isCoordinatingDiscoveryServer>false</isCoordinatingDiscoveryServer>
      <lastUpdatedTimestamp>1629385562132</lastUpdatedTimestamp>
      <lastDirtyTimestamp>1629385562039</lastDirtyTimestamp>
      <actionType>ADDED</actionType>
    </instance>
  </application>
</applications>
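To make the shape of this payload concrete, here is a small sketch of how such an instance element can be decoded with Go's encoding/xml. The type names (instance, port, metadata, tag) and the decodeInstance helper are assumptions made for this example, and only a few fields are mapped. The point of interest is the `,any` tag, which collects metadata children whose element names are not known in advance.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Illustrative types: only the fields needed for this example are mapped.
type instance struct {
	HostName string    `xml:"hostName"`
	Port     *port     `xml:"port"`
	Metadata *metadata `xml:"metadata"`
}

// port reads the element text as the port number and the
// "enabled" attribute as a boolean.
type port struct {
	Port    int  `xml:",chardata"`
	Enabled bool `xml:"enabled,attr"`
}

// metadata collects every child element, whatever its name.
type metadata struct {
	Items []tag `xml:",any"`
}

// tag keeps the element name and its raw inner content.
type tag struct {
	XMLName xml.Name
	Content string `xml:",innerxml"`
}

// decodeInstance parses a single <instance> element.
func decodeInstance(doc string) (*instance, error) {
	var inst instance
	if err := xml.Unmarshal([]byte(doc), &inst); err != nil {
		return nil, err
	}
	return &inst, nil
}

// sample is a trimmed copy of the instance shown above.
const sample = `<instance>
  <hostName>192.168.3.121</hostName>
  <port enabled="true">8001</port>
  <metadata>
    <management.port>8001</management.port>
    <scrape__enable>true</scrape__enable>
    <scrape.port>8080</scrape.port>
  </metadata>
</instance>`

func main() {
	inst, err := decodeInstance(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(inst.HostName, inst.Port.Port, inst.Port.Enabled)
	for _, m := range inst.Metadata.Items {
		fmt.Printf("%s=%s\n", m.XMLName.Local, m.Content)
	}
}
```

Port and Metadata are pointers so that a missing element stays nil, which lets the caller distinguish "absent" from "zero".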

The information of each instance is parsed into a scrape target:

func targetsForApp(app *Application) []model.LabelSet {
    targets := make([]model.LabelSet, 0, len(app.Instances))

    // Gather info about the app's 'instances'. Each instance is considered a task.
    for _, t := range app.Instances {
        var targetAddress string
        // __address__ is built from instance.hostname and port; without a port, port 80 is used
        if t.Port != nil {
            targetAddress = net.JoinHostPort(t.HostName, strconv.Itoa(t.Port.Port))
        } else {
            targetAddress = net.JoinHostPort(t.HostName, "80")
        }

        target := model.LabelSet{
            model.AddressLabel:  lv(targetAddress),
            model.InstanceLabel: lv(t.InstanceID),

            appNameLabel:                     lv(app.Name),
            appInstanceHostNameLabel:         lv(t.HostName),
            appInstanceHomePageURLLabel:      lv(t.HomePageURL),
            appInstanceStatusPageURLLabel:    lv(t.StatusPageURL),
            appInstanceHealthCheckURLLabel:   lv(t.HealthCheckURL),
            appInstanceIPAddrLabel:           lv(t.IPAddr),
            appInstanceVipAddressLabel:       lv(t.VipAddress),
            appInstanceSecureVipAddressLabel: lv(t.SecureVipAddress),
            appInstanceStatusLabel:           lv(t.Status),
            appInstanceCountryIDLabel:        lv(strconv.Itoa(t.CountryID)),
            appInstanceIDLabel:               lv(t.InstanceID),
        }

        if t.Port != nil {
            target[appInstancePortLabel] = lv(strconv.Itoa(t.Port.Port))
            target[appInstancePortEnabledLabel] = lv(strconv.FormatBool(t.Port.Enabled))
        }

        if t.SecurePort != nil {
            target[appInstanceSecurePortLabel] = lv(strconv.Itoa(t.SecurePort.Port))
            target[appInstanceSecurePortEnabledLabel] = lv(strconv.FormatBool(t.SecurePort.Enabled))
        }

        if t.DataCenterInfo != nil {
            target[appInstanceDataCenterInfoNameLabel] = lv(t.DataCenterInfo.Name)

            if t.DataCenterInfo.Metadata != nil {
                for _, m := range t.DataCenterInfo.Metadata.Items {
                    ln := strutil.SanitizeLabelName(m.XMLName.Local)
                    target[model.LabelName(appInstanceDataCenterInfoMetadataPrefix+ln)] = lv(m.Content)
                }
            }
        }

        if t.Metadata != nil {
            for _, m := range t.Metadata.Items {
                // Prometheus label names only allow [a-zA-Z0-9_];
                // any other character is replaced with an underscore
                ln := strutil.SanitizeLabelName(m.XMLName.Local)
                target[model.LabelName(appInstanceMetadataPrefix+ln)] = lv(m.Content)
            }
        }

        targets = append(targets, target)
    }
    return targets
}
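The address logic at the top of the loop is worth isolating, since it decides what Prometheus actually scrapes. The sketch below pulls it into a standalone helper; the name targetAddress and its signature are assumptions made for this example, with a nil port standing in for an instance that has no port element.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// targetAddress joins the hostname with the given port,
// or with "80" when no port is available (port == nil).
func targetAddress(hostName string, port *int) string {
	if port != nil {
		return net.JoinHostPort(hostName, strconv.Itoa(*port))
	}
	return net.JoinHostPort(hostName, "80")
}

func main() {
	p := 8001
	fmt.Println(targetAddress("192.168.3.121", &p))  // 192.168.3.121:8001
	fmt.Println(targetAddress("192.168.3.121", nil)) // 192.168.3.121:80
	fmt.Println(targetAddress("fe80::1", &p))        // [fe80::1]:8001
}
```

net.JoinHostPort is used instead of plain string concatenation because it brackets IPv6 hosts correctly.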

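The parsing is straightforward, so it is not analyzed further here. The parsed label data looks like this:

[Figure: labels of a discovered target after parsing]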

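Three points about these labels deserve special mention:

1. __address__: its value is built from instance.hostname and port (default 80), so make sure the hostname registered in Eureka is accurate; otherwise the target may not be scrapable;

2. metadata-map entries are converted into labels of the form __meta_eureka_app_instance_metadata_<metadataname>. Relabeling in Prometheus usually works on metadata-map entries, which makes it possible to customize the metrics path, the scrape port, and so on;

3. Prometheus label names only allow [a-zA-Z0-9_]; any other character is converted to an underscore, see strutil.SanitizeLabelName(m.XMLName.Local). Note, however, that when a metadata-map key in the Eureka client configuration contains an underscore, it becomes a double underscore once registered with the Eureka server, as with the following configuration: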

eureka:
  instance:
    metadata-map:
      scrape_enable: true
      scrape.port: 8080

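Fetching /eureka/apps then returns the key as scrape__enable, as in the <metadata> element of the sample response shown earlier.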
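The effect on the final label names can be checked with a small sketch. The function sanitizeLabelName below is a local stand-in written for this example that applies the [a-zA-Z0-9_] rule described above; it is not the Prometheus strutil package itself, and metadataLabel is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidLabelChar matches every character that is not allowed in a label name.
var invalidLabelChar = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitizeLabelName replaces each invalid character with an underscore.
// Local stand-in for the sanitizing rule, written for this example.
func sanitizeLabelName(name string) string {
	return invalidLabelChar.ReplaceAllString(name, "_")
}

// metadataPrefix is the label prefix for instance metadata listed in the overview.
const metadataPrefix = "__meta_eureka_app_instance_metadata_"

// metadataLabel builds the meta label name for one metadata key.
func metadataLabel(key string) string {
	return metadataPrefix + sanitizeLabelName(key)
}

func main() {
	// Keys as they appear in the /eureka/apps response.
	for _, key := range []string{"scrape__enable", "scrape.port", "management.port"} {
		fmt.Println(metadataLabel(key))
	}
}
```

Under these assumptions the double underscore survives unchanged, so a relabel rule has to reference __meta_eureka_app_instance_metadata_scrape__enable, while scrape.port ends up as __meta_eureka_app_instance_metadata_scrape_port.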

Summary

[Figure: how Eureka-based service discovery works]

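A Discoverer is created from the eureka_sd_configs service discovery configuration, and its Discoverer.Run method is run in a goroutine. The core Eureka discovery logic is encapsulated in the func (d *Discovery) refresh(ctx context.Context) ([]*targetgroup.Group, error) method in discovery/eureka.go.

The refresh method mainly calls two functions:

1. fetchApps: periodically pulls the metadata of the registered services from the Eureka server's /eureka/apps endpoint;

2. targetsForApp: parses the metadata pulled in the previous step, iterates over the instances of each app, parses each instance into a target, and converts the remaining metadata into target meta labels that relabel_configs can operate on.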