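In a distributed microservice architecture, a single external request usually passes through several cooperating services, which makes it hard to locate failures and find performance bottlenecks quickly. This article shows how to integrate Spring Cloud Sleuth and Zipkin into a Spring Cloud system made up of three modules (Gateway, Auth, and Business) to get end-to-end tracing, how to deploy Zipkin with Docker, and how to persist trace data in Elasticsearch.
External request → Gateway → Auth → Business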
Add the following dependencies to the pom.xml of every microservice module (Gateway, Auth, and Business):
    <!-- Sleuth dependency -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-sleuth</artifactId>
    </dependency>
    <!-- Zipkin client dependency -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-sleuth-zipkin</artifactId>
    </dependency>
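Add this shared configuration to the application.yml of each module:
    spring:
      sleuth:
        sampler:
          # Sampling rate: 1.0 is fine for development; about 0.1 is suggested for production
          probability: 1.0
        # Propagation format (Sleuth 3.x accepts B3, W3C, AWS, or CUSTOM; B3 is the default)
        propagation:
          type: W3C
      zipkin:
        # Zipkin server address
        base-url: http://zipkin-server:9411
        # Report spans over HTTP (kafka and rabbit are the alternatives)
        sender:
          type: web
      # The application name is used as the service name of the endpoint in Zipkin
      application:
        name: gateway-service # set per module, e.g. auth-service, business-service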
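With `propagation.type: W3C`, the trace context travels between services in the `traceparent` HTTP header defined by the W3C Trace Context specification. The sketch below is plain Java for illustration only, not Sleuth's own code; the class name `TraceParent` and its methods are hypothetical. It shows how the header's four fields are laid out and how a downstream hop keeps the trace ID while receiving a new span ID.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; Sleuth reads and writes this header itself.
final class TraceParent {
    // Layout: version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
    private static final Pattern FORMAT = Pattern.compile(
            "^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String spanId;
    final boolean sampled;

    TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    // Simplified parser: checks the layout only, not the spec's ban on all-zero IDs
    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header);
        if (!m.matches()) {
            throw new IllegalArgumentException("Malformed traceparent: " + header);
        }
        // Bit 0 of the flags byte is the "sampled" flag
        boolean sampled = (Integer.parseInt(m.group(4), 16) & 0x01) == 1;
        return new TraceParent(m.group(2), m.group(3), sampled);
    }

    // A downstream call keeps the trace ID and sampling decision but gets a new span ID
    TraceParent child(String newSpanId) {
        return new TraceParent(traceId, newSpanId, sampled);
    }

    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    public static void main(String[] args) {
        TraceParent incoming =
                parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println(incoming.child("b7ad6b7169203331").toHeader());
    }
}
```

The shared trace ID is what lets Zipkin stitch the Gateway, Auth, and Business spans into one trace; the sampled flag carries the decision made at the first hop to every service behind it.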
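Sleuth instruments Spring Cloud Gateway automatically, so trace headers are forwarded without extra settings. When debugging propagation, it can help to enable Reactor Netty wiretap logging on the Gateway (together with the reactor.netty log level set to DEBUG or TRACE) to inspect the headers that are actually sent and received:
    spring:
      cloud:
        gateway:
          httpclient:
            wiretap: true # log outbound (proxied) requests
          httpserver:
            wiretap: true # log inbound requests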
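For the Auth service, you may want to exclude endpoints such as health checks from tracing. A plain Sampler cannot see the HTTP request, so the sketch below assumes Sleuth 3.x and registers an HTTP server sampler instead:
    import org.springframework.cloud.sleuth.SamplerFunction;
    import org.springframework.cloud.sleuth.http.HttpRequest;
    import org.springframework.cloud.sleuth.instrument.web.HttpServerSampler;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class SleuthConfig {
        // Consulted for every incoming HTTP request before a span is created
        @Bean(name = HttpServerSampler.NAME)
        public SamplerFunction<HttpRequest> httpServerSampler() {
            return request -> {
                String path = request.path();
                // Never trace actuator health checks
                if (path != null && path.contains("actuator/health")) {
                    return false;
                }
                // null defers to the configured sampler (spring.sleuth.sampler.probability)
                return null;
            };
        }
    }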
The following docker-compose.yml runs Zipkin with Elasticsearch as its storage backend:
    version: '3.8'
    services:
      zipkin:
        image: openzipkin/zipkin:2.23
        container_name: zipkin
        environment:
          - STORAGE_TYPE=elasticsearch
          - ES_HOSTS=elasticsearch:9200
          - ES_INDEX=zipkin
          - ES_DATE_SEPARATOR=-
          - ES_INDEX_SHARDS=3
          - ES_INDEX_REPLICAS=1
        ports:
          - "9411:9411"
        depends_on:
          - elasticsearch
        networks:
          - tracing-network
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
        container_name: elasticsearch
        environment:
          - discovery.type=single-node
          - bootstrap.memory_lock=true
          - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
          - xpack.security.enabled=false
        ulimits:
          memlock:
            soft: -1
            hard: -1
        volumes:
          - es-data:/usr/share/elasticsearch/data
        ports:
          - "9200:9200"
        networks:
          - tracing-network
    volumes:
      es-data:
    networks:
      tracing-network:
        driver: bridge
Start the services:
    docker-compose up -d
Then check that both are reachable:
Zipkin UI: http://localhost:9411
Elasticsearch health: http://localhost:9200/_cat/health?v
Send a test request through the Gateway:
    curl -X GET http://gateway:8080/api/business/example
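Each service's log should now carry the prefix added by Sleuth: service name, trace ID, span ID, and an export flag. The fourth field is printed by Sleuth 2.x; Sleuth 3.x prints only the first three. The trace ID is identical across services, while each service has its own span ID.
Gateway log example:
    [gateway-service,5f8a5d3b2c1d0e9f,a1b2c3d4e5f6g7h8,true] 2023-05-20 10:00:00.123 INFO ... Received request /api/business/example
Auth log example:
    [auth-service,5f8a5d3b2c1d0e9f,b2c3d4e5f6g7h8i9,true] 2023-05-20 10:00:00.456 INFO ... Authentication passed
Business log example:
    [business-service,5f8a5d3b2c1d0e9f,c3d4e5f6g7h8i9j0,true] 2023-05-20 10:00:00.789 INFO ... Processing business logic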
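To follow one request across services, you can filter each service's log by the shared trace ID. The following is a minimal plain-Java sketch, assuming the bracketed prefix format shown in the log examples; the class `SleuthLogPrefix` and its methods are hypothetical names, not part of Sleuth.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses the "[service,traceId,spanId(,exportable)]" log prefix.
final class SleuthLogPrefix {
    // Three comma-separated fields, plus an optional fourth (the export flag)
    private static final Pattern PREFIX = Pattern.compile(
            "^\\[([^,\\]]*),([^,\\]]*),([^,\\]]*)(?:,([^,\\]]*))?\\]");

    final String service;
    final String traceId;
    final String spanId;

    private SleuthLogPrefix(String service, String traceId, String spanId) {
        this.service = service;
        this.traceId = traceId;
        this.spanId = spanId;
    }

    // Returns null when the line does not start with a recognizable prefix
    static SleuthLogPrefix parse(String line) {
        Matcher m = PREFIX.matcher(line);
        if (!m.lookingAt()) {
            return null;
        }
        return new SleuthLogPrefix(m.group(1), m.group(2), m.group(3));
    }

    // True when the line belongs to the given trace
    static boolean belongsToTrace(String line, String traceId) {
        SleuthLogPrefix prefix = parse(line);
        return prefix != null && prefix.traceId.equals(traceId);
    }

    public static void main(String[] args) {
        String line = "[auth-service,5f8a5d3b2c1d0e9f,b2c3d4e5f6g7h8i9,true] "
                + "2023-05-20 10:00:00.456 INFO ... Authentication passed";
        SleuthLogPrefix prefix = parse(line);
        System.out.println(prefix.service + " handled trace " + prefix.traceId);
    }
}
```

In practice a log aggregation tool does this filtering for you; the point is only that the trace ID in the prefix is the join key between the three logs.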
Open the Zipkin UI (http://localhost:9411) and search for the trace ID 5f8a5d3b2c1d0e9f. The trace should show the full call chain:
gateway-service → auth-service → business-service
In production, lower the sampling rate:
    spring:
      sleuth:
        sampler:
          probability: 0.1 # sample only 10% of requests
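Alternatively, define custom sampling logic. The sketch below assumes Sleuth 3.x, whose HTTP server sampler receives the request (a plain Sampler does not):
    // Imports: org.springframework.cloud.sleuth.SamplerFunction,
    // org.springframework.cloud.sleuth.http.HttpRequest,
    // org.springframework.cloud.sleuth.instrument.web.HttpServerSampler
    @Bean(name = HttpServerSampler.NAME)
    public SamplerFunction<HttpRequest> customSampler() {
        return request -> {
            String path = request.path();
            // Always sample important endpoints
            if (path != null && path.contains("/api/important")) {
                return true;
            }
            // null defers to spring.sleuth.sampler.probability (e.g. 0.1) for everything else
            return null;
        };
    }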
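The rule "always sample important endpoints, otherwise sample about 10%" can be checked in isolation before wiring it into Sleuth. The sketch below is plain Java with hypothetical names (`SamplingRule`, `shouldSample`); it is not how Sleuth's probability sampler is implemented. It assumes trace IDs are roughly uniformly distributed and derives the decision from the trace ID so that the same trace always gets the same answer.

```java
// Hypothetical, framework-free version of the sampling rule, for illustration only.
final class SamplingRule {
    private static final long BUCKETS = 10_000L;
    private final long threshold;

    SamplingRule(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be within [0, 1]");
        }
        // Number of buckets (out of 10,000) that count as "sampled"
        this.threshold = Math.round(probability * BUCKETS);
    }

    boolean shouldSample(String path, long traceId) {
        // Important endpoints are always traced
        if (path != null && path.contains("/api/important")) {
            return true;
        }
        // floorMod keeps the bucket non-negative even for negative trace IDs
        long bucket = Math.floorMod(traceId, BUCKETS);
        return bucket < threshold;
    }

    public static void main(String[] args) {
        SamplingRule rule = new SamplingRule(0.1);
        System.out.println(rule.shouldSample("/api/important/orders", 9_999L));
        System.out.println(rule.shouldSample("/api/business/example", 9_999L));
    }
}
```

In a real deployment the decision is normally made once, at the first service that sees the request, and then carried downstream in the propagated sampled flag, so the Auth and Business services do not need to repeat it.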
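You can add custom tags to the current span from business code (the Tracer here is Sleuth 3.x's org.springframework.cloud.sleuth.Tracer, injected by Spring):
    @RestController
    public class ExampleController {
        private final Tracer tracer;

        public ExampleController(Tracer tracer) {
            this.tracer = tracer;
        }

        @GetMapping("/example")
        public String example() {
            // Current span; null when no trace is active on this thread
            Span span = tracer.currentSpan();
            if (span != null) {
                span.tag("user.id", "12345");
                span.tag("business.type", "A001");
            }
            // Business logic...
            return "ok"; // placeholder response
        }
    }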
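Sleuth propagates the trace context automatically for the executors it manages, but work handed to an executor it does not wrap, such as the common pool used by CompletableFuture.runAsync, needs the context passed by hand. The sketch below assumes the Sleuth 3.x Tracer API:
    public CompletableFuture<Void> asyncOperation() {
        // Capture the caller's span before switching threads
        Span parent = tracer.currentSpan();
        return CompletableFuture.runAsync(() -> {
            // Child of the captured span (starts a new trace if parent is null)
            Span span = tracer.nextSpan(parent).name("async-job").start();
            try (Tracer.SpanInScope ws = tracer.withSpan(span)) {
                // Business logic runs with the span in scope
            } finally {
                span.end();
            }
        });
    }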
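Manual propagation is needed because a tracer keeps its "current span" in thread-bound state, and a worker thread starts without it. The sketch below reproduces the effect in plain Java with a `ThreadLocal` standing in for the tracer's state; `ContextPropagationDemo`, `CURRENT_TRACE`, and `wrap` are hypothetical names, not Sleuth APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Illustration only: a ThreadLocal stands in for the tracer's thread-bound context.
final class ContextPropagationDemo {
    static final ThreadLocal<String> CURRENT_TRACE = new ThreadLocal<>();

    // Captures the caller's context now and restores it around the task later
    static Runnable wrap(Runnable task) {
        String captured = CURRENT_TRACE.get(); // read on the submitting thread
        return () -> {
            String previous = CURRENT_TRACE.get();
            CURRENT_TRACE.set(captured);
            try {
                task.run();
            } finally {
                // Leave the worker thread as we found it
                if (previous == null) {
                    CURRENT_TRACE.remove();
                } else {
                    CURRENT_TRACE.set(previous);
                }
            }
        };
    }

    // Runs a task on a fresh worker thread and reports which trace ID it saw
    static String seenByWorker(boolean wrapped) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            AtomicReference<String> seen = new AtomicReference<>();
            Runnable task = () -> seen.set(CURRENT_TRACE.get());
            CompletableFuture.runAsync(wrapped ? wrap(task) : task, pool).join();
            return seen.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CURRENT_TRACE.set("5f8a5d3b2c1d0e9f");
        System.out.println(seenByWorker(false)); // prints null: context was lost
        System.out.println(seenByWorker(true));  // prints 5f8a5d3b2c1d0e9f
    }
}
```

Capturing on the submitting thread and restoring on the worker is the same capture-then-scope pattern used with the tracer: read the current span before going async, then put it in scope inside the task.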
To write the trace ID and span ID to log files, configure the pattern in logback-spring.xml:
    <pattern>
        [%X{traceId:-},%X{spanId:-}] %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
    </pattern>
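If traces do not show up in Zipkin, check the spring.zipkin.base-url configuration first.
With the setup described in this article, the Spring Cloud system gains distributed tracing across all three services. Sleuth and Zipkin together cover the path from trace IDs in the logs to visual analysis of a call chain, and running Zipkin on Docker with Elasticsearch storage keeps trace data across restarts and leaves room to scale the storage layer.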
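This setup helps developers locate problems quickly and supplies the data needed for performance tuning, which makes it a valuable operations tool in a microservice architecture. As the system grows, a fuller APM system such as SkyWalking is worth considering, but the current setup is enough for most small and medium-sized systems.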