Reputation: 11
I'm totally new to the whole Observability ecosystem and I'm trying to migrate a given project from Sleuth to Micrometer Tracing. Let's say I'm currently using a simple setup: REST call -> OpenTelemetry + Spring Cloud Gateway (WebFlux-based) -> service A, and finally the tracing output lands in Kibana. Both the gateway and service A share a custom common lib where a build.gradle is defined as below.
project(':common-tracing') {
    dependencies {
        api platform(group: 'io.micrometer', name: 'micrometer-tracing-bom', version: "${micrometerVersion}")
        runtimeOnly("io.opentelemetry.instrumentation:opentelemetry-logback-mdc-1.0:2.0.0-alpha")
        api(project(':common-properties'))
        api(group: 'io.micrometer', name: 'micrometer-tracing')
        api(group: 'io.micrometer', name: 'micrometer-tracing-bridge-otel')
        api(group: 'io.micrometer', name: 'micrometer-registry-prometheus')
        implementation(group: 'org.springframework', name: 'spring-context', version: '6.0.11')
    }
}
I've added the logback-mdc dependency because I'm trying to manually instrument some parts through MDC; this wasn't the case pre-migration.
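With micrometer-tracing plus the OTel bridge on the classpath, Spring Boot's auto-configuration should also expose Micrometer's Tracer facade as a bean. A minimal sketch of manual instrumentation through that facade instead of the raw OTel API (assuming Spring Boot 3.x auto-configuration; ManualSpanExample is an illustrative name, not from the project):

import io.micrometer.tracing.Span;
import io.micrometer.tracing.Tracer;
import org.springframework.stereotype.Component;

@Component
class ManualSpanExample {

    private final Tracer tracer; // Micrometer facade, backed by OTel via the bridge

    ManualSpanExample(Tracer tracer) {
        this.tracer = tracer;
    }

    void doWork() {
        Span span = this.tracer.nextSpan().name("manual-work").start();
        try (Tracer.SpanInScope ws = this.tracer.withSpan(span)) {
            // business logic; while the scope is open, traceId/spanId should be
            // visible to the logback MDC instrumentation
        } finally {
            span.end();
        }
    }
}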
In both the gateway's and service A's app.yml there is:
management:
  tracing:
    enabled: true
    sampling:
      probability: 1.0
For service A, further changes consisted only of:
logging.pattern.level: "%5p [${spring.zipkin.service.name:${spring.application.name:-}},%X{user:-},%X{traceId:-},%X{spanId:-},%X{X-Span-Export:-}]"
For the gateway, OpenTelemetry is used as the tracer, and its build.gradle contains:
implementation(group: 'io.opentelemetry', name: 'opentelemetry-api')
implementation(group: 'io.opentelemetry', name: 'opentelemetry-sdk')
implementation(group: 'io.opentelemetry', name: 'opentelemetry-exporter-logging')
implementation(group: 'io.opentelemetry', name: 'opentelemetry-semconv', version: '1.30.1-alpha')
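For reference, a hedged sketch of how such a hand-built OpenTelemetry instance with W3C traceparent propagation might be wired (the openTelemetry and tracer fields used by the filters below presumably come from something like this; note that a hand-built SDK bypasses, and can conflict with, the instance Spring Boot auto-configures for Micrometer Tracing):

import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
import io.opentelemetry.context.propagation.ContextPropagators;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class OtelConfig {

    // An SDK instance able to extract/inject W3C traceparent headers.
    @Bean
    OpenTelemetry openTelemetry() {
        return OpenTelemetrySdk.builder()
                .setTracerProvider(SdkTracerProvider.builder().build())
                .setPropagators(ContextPropagators.create(W3CTraceContextPropagator.getInstance()))
                .build();
    }

    @Bean
    Tracer gatewayTracer(OpenTelemetry openTelemetry) {
        return openTelemetry.getTracer("gateway");
    }
}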
There are two filters implementing GlobalFilter applied through the gateway.
TraceContextGlobalFilter.class:
@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    return chain.filter(exchange).then(Mono.fromRunnable(() -> processTraceHeaders(exchange)));
}
processTraceHeaders function:
ServerHttpRequest request = exchange.getRequest();
if (isExternalRequest(request)) {
    Context context =
        openTelemetry
            .getPropagators()
            .getTextMapPropagator()
            .extract(Context.current(), request, serverHttpRequestGetter);
    log.info("before span");
    Span span =
        tracer
            .spanBuilder("test-span")
            .setParent(context)
            .setSpanKind(SpanKind.SERVER)
            .startSpan();
    try {
        try (Scope scope = span.makeCurrent()) {
            span.setAttribute(stringKey("http.method"), request.getMethod().name());
            span.setAttribute(stringKey("http.url"), request.getURI().toASCIIString());
            MDC.put("spanId", span.getSpanContext().getSpanId()); // this was added post-migration
            MDC.put("traceId", span.getSpanContext().getTraceId()); // this was added post-migration
            openTelemetry
                .getPropagators()
                .getTextMapPropagator()
                .inject(Context.current(), exchange.getResponse(), serverHttpRequestSetter);
        }
    } finally {
        span.end();
        MDC.remove("traceId"); // this was added post-migration
        MDC.remove("spanId"); // this was added post-migration
    }
}
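As an aside, with .then(Mono.fromRunnable(...)) the span is only created after the downstream chain has already completed. A hedged alternative sketch (reusing the openTelemetry, tracer and serverHttpRequestGetter fields from the filter above; "gateway-span" is an illustrative name) that starts the span before delegating and ends it when the exchange finishes:

@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    // Extract the remote trace context up front so the span covers the
    // downstream call instead of being created after it has finished.
    Context extracted = openTelemetry.getPropagators().getTextMapPropagator()
            .extract(Context.current(), exchange.getRequest(), serverHttpRequestGetter);
    Span span = tracer.spanBuilder("gateway-span")
            .setParent(extracted)
            .setSpanKind(SpanKind.SERVER)
            .startSpan();
    return chain.filter(exchange)
            .doFinally(signal -> span.end()); // end on complete, error or cancel
}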
TraceContextLoggingFilter.class:
@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
    return chain.filter(exchange).then(Mono.fromRunnable(() -> processTraceHeaderLog(exchange)));
}

private void processTraceHeaderLog(ServerWebExchange exchange) {
    try {
        String traceParent = exchange.getResponse().getHeaders().getFirst(TRACE_PARENT_HEADER);
        if (StringUtils.isNotBlank(traceParent)) {
            log.info("External Trace Info: {}", traceParent);
        }
    } catch (Exception e) {
        log.warn("Failed to log external request trace header", e);
    }
}
Logging pattern for the gateway is defined as below:
logging.pattern.level: "%5p [${spring.zipkin.service.name:${spring.application.name:-}},%X{user:-},%X{traceId:-},%X{spanId:-},%X{X-Span-Export:-}]"
The problem is that post-migration I can see in Kibana through my gateway pod:
External Trace Info: 00-6ee45b5142af8c5dfbc0e85fd7393331-f14ef56bac8894c7-01
But there is no traceId or spanId. Even if I log.info(Span.current().getSpanContext().getTraceId()) alongside the External Trace Info line, the output is just 00000000000000000000000000000000 (the all-zero value OpenTelemetry uses for an invalid/empty trace id).
However, in my service A pod I can get both traceId and spanId. So it seems like the traceId is getting initialized in the tracing context, but it's not printed alongside the "External Trace Info" log line.
What I've tried: I could get manual instrumentation going by adding in TraceContextGlobalFilter:
MDC.put("spanId", span.getSpanContext().getSpanId());
MDC.put("traceId", span.getSpanContext().getTraceId());
And changing the logging pattern to:
logging.pattern.level: "traceId=%mdc{traceId} spanId=%mdc{spanId} %5p"
Then I could get both traceId and spanId alongside the "External Trace Info" log.
The thing is that pre-migration this was done through automatic instrumentation, not through MDC. Moreover, the traceId was not derived from the traceparent, i.e.
6ee45b5142af8c5dfbc0e85fd7393331
but was created ad hoc. This manual approach is also very specific to this code path; traces no longer show up, for example, in error handling the way they did before.
Upvotes: 1
Views: 1243
Reputation: 11
Answering my own question: since Spring Cloud Gateway is WebFlux-based, context propagation between threads is needed through the Reactor Context. In order to do that, you have to add the line below to your main application class:
Hooks.enableAutomaticContextPropagation();
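For context, a minimal sketch of where that call goes, assuming a standard Spring Boot 3.x main class (GatewayApplication is an illustrative name); the hook requires reactor-core 3.5+ and works through the io.micrometer:context-propagation library:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import reactor.core.publisher.Hooks;

@SpringBootApplication
public class GatewayApplication {

    public static void main(String[] args) {
        // Install the hook before any reactive pipelines are assembled so the
        // OTel/MDC ThreadLocal context follows Reactor's thread switches.
        Hooks.enableAutomaticContextPropagation();
        SpringApplication.run(GatewayApplication.class, args);
    }
}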
However, the previous Sleuth implementation used to create a new traceId rather than derive it from the traceparent. Since in our application we can receive external requests with the same traceparent from view clients, having the same traceId across two different queries would confuse the debugging process. So I need to figure something out for that now.
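One possible direction for that last point (a hedged sketch, not verified in this setup): the OpenTelemetry API can start the server span as a new root via setNoParent() and keep the incoming traceparent as a span link instead of a parent, which restores Sleuth-like fresh traceIds while preserving a pointer to the caller's trace:

// Inside processTraceHeaders, after extracting 'context' from the request:
Span span =
    tracer
        .spanBuilder("test-span")
        .setNoParent() // start a new trace: fresh traceId for every request
        .addLink(Span.fromContext(context).getSpanContext()) // keep the external traceparent as a link
        .setSpanKind(SpanKind.SERVER)
        .startSpan();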
Upvotes: 0