When a request spans 5 microservices and something is slow, distributed tracing tells you exactly where the latency is. Learn OpenTelemetry auto-instrumentation with Spring Boot 3.
JOptimize Team
In a monolith, a slow request shows up in a profiler — you find the slow method and fix it. In a microservices architecture, a slow request might span an API gateway, an auth service, a product service, two DB calls, and a Redis lookup. Without distributed tracing, you get a 500ms response time and no idea which service caused it. With tracing, you see the full call tree and the exact milliseconds spent in each service.
Spring Boot 3 auto-configures Micrometer Tracing, a vendor-neutral tracing API that bridges to OpenTelemetry:
<!-- Micrometer Tracing with OpenTelemetry bridge -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>

<!-- Export to Zipkin (or swap for Tempo, Jaeger) -->
<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-zipkin</artifactId>
</dependency>

<!-- Actuator wires up the observations that instrument Spring MVC, WebClient, RestClient, Kafka -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
# application.properties

# Sample 100% of requests in dev; use 0.1 in prod.
# (.properties files have no inline comments, so keep this on its own line.)
management.tracing.sampling.probability=1.0
management.zipkin.tracing.endpoint=http://zipkin:9411/api/v2/spans

# Correlate logs with traces
logging.pattern.level=%5p [${spring.application.name:},%X{traceId:-},%X{spanId:-}]
With these dependencies, Spring Boot automatically instruments:
- WebClient / RestClient calls (propagates trace context)
- @KafkaListener methods (continues trace from message headers)
- JdbcTemplate queries (creates DB spans; this typically needs a JDBC observation library such as datasource-micrometer on the classpath)

Trace: GET /api/orders/dashboard [total: 347ms]
├── http GET /api/orders/dashboard [12ms] (API Gateway)
│   ├── auth-service: validateToken [8ms] (Auth Service)
│   └── order-service: getDashboard [327ms] (Order Service) ← SLOW
│       ├── jdbc: SELECT * FROM orders [180ms] ← BOTTLENECK
│       ├── redis: GET user:42:prefs [2ms]
│       └── http GET /inventory/bulk [145ms] (Inventory Service)
│           └── jdbc: SELECT * FROM stock [140ms]
One glance tells you: the slow JDBC query in order-service is the problem. Without tracing, you'd spend hours adding logs and trying to reproduce the slowdown.
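To make the reading of that tree concrete, here is a minimal sketch that models the same trace in plain Java and finds the bottleneck programmatically. The `SpanNode` record and `TraceTreeDemo` class are hypothetical names, not part of Micrometer or OpenTelemetry, and the sketch assumes the gateway's 12ms is its own ("self") time while the 327ms and 145ms figures include their children.

```java
import java.util.List;

// Hypothetical model of one span: its own ("self") time plus its child spans.
record SpanNode(String name, long selfMillis, List<SpanNode> children) {

    // Inclusive duration: self time plus everything spent in child spans.
    long totalMillis() {
        long total = selfMillis;
        for (SpanNode child : children) {
            total += child.totalMillis();
        }
        return total;
    }

    // Depth-first search for the span with the largest self time.
    SpanNode slowest() {
        SpanNode best = this;
        for (SpanNode child : children) {
            SpanNode candidate = child.slowest();
            if (candidate.selfMillis() > best.selfMillis()) {
                best = candidate;
            }
        }
        return best;
    }
}

final class TraceTreeDemo {

    // The dashboard trace from the article, expressed as self times (assumed split).
    static SpanNode dashboardTrace() {
        SpanNode stockQuery = new SpanNode("jdbc: SELECT * FROM stock", 140, List.of());
        SpanNode inventory = new SpanNode("http GET /inventory/bulk", 5, List.of(stockQuery));
        SpanNode ordersQuery = new SpanNode("jdbc: SELECT * FROM orders", 180, List.of());
        SpanNode redis = new SpanNode("redis: GET user:42:prefs", 2, List.of());
        SpanNode orderService = new SpanNode("order-service: getDashboard", 0,
                List.of(ordersQuery, redis, inventory));
        SpanNode auth = new SpanNode("auth-service: validateToken", 8, List.of());
        return new SpanNode("http GET /api/orders/dashboard", 12, List.of(auth, orderService));
    }

    public static void main(String[] args) {
        SpanNode root = dashboardTrace();
        System.out.println("total: " + root.totalMillis() + "ms");
        System.out.println("slowest: " + root.slowest().name());
    }
}
```

Ranking by self time rather than inclusive time is the design choice that matters here: order-service looks slow at 327ms, but almost all of that time belongs to its children.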
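import io.micrometer.tracing.Span;
import io.micrometer.tracing.Tracer;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class ReportService {

    private final Tracer tracer;

    public Report generateReport(Long userId) {
        // Create a custom span for the expensive operation
        Span span = tracer.nextSpan()
                .name("report.generate")
                .tag("userId", userId.toString())
                .tag("reportType", "monthly")
                .start();
        // Put the span in scope so nested spans and log lines attach to it
        try (Tracer.SpanInScope ws = tracer.withSpan(span)) {
            Report report = expensiveReportGeneration(userId);
            span.tag("report.rows", String.valueOf(report.getRowCount()));
            return report;
        } catch (Exception e) {
            span.error(e); // Record the failure on the span before rethrowing
            throw e;
        } finally {
            span.end(); // Always end the span so it gets exported
        }
    }
}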
Custom spans appear in the trace tree with your tags — you can filter by userId or reportType in Zipkin/Grafana Tempo.
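With Spring Kafka 3.x and Micrometer Tracing, trace context travels in message headers once observation is enabled on the template and the listener container (in recent Spring Boot 3.x releases: spring.kafka.template.observation-enabled=true and spring.kafka.listener.observation-enabled=true):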
// Producer: trace context is injected into message headers automatically
@Service
@RequiredArgsConstructor
public class OrderProducer {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    public void publishOrderCreated(OrderEvent event) {
        kafkaTemplate.send("orders", event.getOrderId().toString(), event);
        // Micrometer injects: traceparent, tracestate headers
    }
}

// Consumer: trace is resumed from the headers automatically
// (the enclosing class name is illustrative)
@Component
public class OrderConsumer {

    @KafkaListener(topics = "orders")
    public void onOrder(OrderEvent event) {
        // This method runs in the same trace as the producer
        // Kafka consumer span appears as child of the producer span
        processOrder(event);
    }
}
The full trace shows: HTTP request → Kafka produce → Kafka consume → DB write — one continuous trace across async boundaries.
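To show what actually crosses the async boundary, the sketch below parses and rebuilds a W3C `traceparent` header value in plain Java. The `TraceParent` record is a hypothetical helper written for illustration, not a Micrometer or Spring Kafka type, and it only handles version `00` without the spec's extra validity checks (for example, all-zero IDs).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type for the fields carried in a W3C traceparent header.
record TraceParent(String traceId, String parentSpanId, boolean sampled) {

    // Layout: version "00" - 32 hex trace id - 16 hex span id - 2 hex flags
    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a version-00 traceparent: " + header);
        }
        int flags = Integer.parseInt(m.group(3), 16);
        // The lowest flag bit marks the trace as sampled.
        return new TraceParent(m.group(1), m.group(2), (flags & 0x01) != 0);
    }

    // A consumer keeps the trace id and becomes the parent for its own outgoing calls.
    TraceParent childOf(String ownSpanId) {
        return new TraceParent(traceId, ownSpanId, sampled);
    }

    String toHeader() {
        return "00-" + traceId + "-" + parentSpanId + "-" + (sampled ? "01" : "00");
    }
}

final class TraceParentDemo {
    public static void main(String[] args) {
        TraceParent incoming =
                TraceParent.parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
        System.out.println("trace id: " + incoming.traceId());
        System.out.println("outgoing: " + incoming.childOf("b7ad6b7169203331").toHeader());
    }
}
```

Because the trace ID is copied unchanged at every hop while only the span ID is swapped, producer and consumer spans end up in the same trace even though no thread or HTTP connection links them.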
# Dev: trace everything
management.tracing.sampling.probability=1.0

# Prod: sample 10% of requests
management.tracing.sampling.probability=0.1
In production, tracing every request typically adds a few percent of overhead (roughly 2-5%, depending on workload and exporter). At 10% sampling, that drops to around 0.5%. To debug specific slow requests, either raise the head-based sampling rate temporarily, or use tail-based sampling (record every request, export only the slow ones) if your collector supports it.
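The difference between the two strategies is easier to see in code. The following is a simplified sketch in plain Java, not the sampler Micrometer or OpenTelemetry actually ships; `ProbabilitySampler`, `SlowTraceFilter`, and the 300ms threshold are hypothetical names and values chosen for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

// Head-based: decide when the trace starts, using random bits of the trace id.
final class ProbabilitySampler {

    private final double probability;
    private final long upperBound;

    ProbabilitySampler(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        this.probability = probability;
        this.upperBound = (long) (probability * Long.MAX_VALUE);
    }

    // The same trace id always yields the same decision.
    boolean shouldSample(long traceIdLowBits) {
        if (probability >= 1.0) {
            return true;
        }
        // Clear the sign bit so the value is uniform over the non-negative longs.
        return (traceIdLowBits & Long.MAX_VALUE) < upperBound;
    }
}

// Tail-based: decide after the trace has finished, when its duration is known.
final class SlowTraceFilter {

    private final long thresholdMillis;

    SlowTraceFilter(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    boolean shouldExport(long traceDurationMillis) {
        return traceDurationMillis >= thresholdMillis;
    }
}

final class SamplingDemo {
    public static void main(String[] args) {
        ProbabilitySampler prod = new ProbabilitySampler(0.1);
        int total = 100_000;
        int kept = 0;
        for (int i = 0; i < total; i++) {
            if (prod.shouldSample(ThreadLocalRandom.current().nextLong())) {
                kept++;
            }
        }
        System.out.printf("head-based kept %.1f%% of traces%n", 100.0 * kept / total);
        System.out.println("tail-based exports 347ms trace: "
                + new SlowTraceFilter(300).shouldExport(347));
    }
}
```

Head-based sampling is cheap because unsampled requests record nothing, but it can miss the one slow request you care about. Tail-based sampling catches it at the cost of recording every request first.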
# Structured logging with trace/span IDs
logging.pattern.console=%d{HH:mm:ss} %highlight(%-5level) [%blue(%X{traceId})/%blue(%X{spanId})] %logger{36} - %msg%n
With this pattern, every log line includes the trace and span IDs:
14:32:01 ERROR [4bf92f3577b34da6/00f067aa0ba902b7] o.s.w.s.OrderService - Failed to load order 42
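As a rough illustration of what that correlation enables, the sketch below filters plain-text log lines down to a single trace. `TraceLogFilter` is a hypothetical helper, the second and third sample lines are made up for the example, and it assumes uncolored output like the line above (ANSI color codes from `%highlight` or `%blue` would need stripping first).

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceLogFilter {

    // Matches the "[traceId/spanId]" block written by the console pattern.
    private static final Pattern IDS = Pattern.compile("\\[([0-9a-f]+)/([0-9a-f]+)\\]");

    // Returns only the lines that belong to the given trace, in original order.
    static List<String> linesForTrace(List<String> logLines, String traceId) {
        return logLines.stream()
                .filter(line -> {
                    Matcher m = IDS.matcher(line);
                    return m.find() && m.group(1).equals(traceId);
                })
                .toList();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "14:32:01 ERROR [4bf92f3577b34da6/00f067aa0ba902b7] o.s.w.s.OrderService - Failed to load order 42",
                // Invented sample lines for the demo:
                "14:32:01 INFO  [aaaabbbbccccdddd/1111222233334444] o.s.w.s.OrderService - Loaded order 7",
                "14:32:02 WARN  [4bf92f3577b34da6/9f8e7d6c5b4a3921] o.s.w.s.StockClient - Retrying inventory call");
        linesForTrace(lines, "4bf92f3577b34da6").forEach(System.out::println);
    }
}
```

In practice a log aggregator does this filtering for you; the point is that the trace ID is an ordinary, searchable field on every line.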
Now you can search your logs by traceId to pull up every line written for a single request.

Common pitfalls:

- Setting sampling.probability=1.0 in production: 100% sampling on a high-traffic service adds noticeable overhead and generates enormous trace data; use 5-10% in prod
- Not forwarding the Authorization header to downstream services: the trace spans the services but the security context doesn't; pass the JWT explicitly
- Not using baggage: Micrometer Tracing Baggage propagates arbitrary key-value pairs across service boundaries; use it for userId, tenantId instead of repeating them as tags on every span

Distributed tracing with Spring Boot 3 + Micrometer Tracing + OpenTelemetry is largely automatic: HTTP calls are instrumented out of the box, and Kafka messages and DB queries join the trace with little extra setup. Add custom spans for domain-significant operations, correlate with logs via MDC trace IDs, and tune sampling to 5-10% in production. The result is full visibility into cross-service latency: you find the bottleneck by reading the trace, not by adding more logs.
JOptimize's live profiling complements distributed tracing by showing you the slowest methods at the JVM level: where tracing stops, profiling begins.