Danielo515

Reputation: 7051

How to group spans under specific traceId in openTelemetry node sdk?

I have a distributed system where certain operations happen asynchronously as several messages are processed by different actors. Each message carries the correlation ID of the message that started the whole flow, so they can easily be traced together. Given that the idea behind OpenTelemetry is to provide a standard way to trace computations across different systems, this should be simple to achieve, but in reality I'm really struggling with it.

The first surprising problem I faced was how hard it is to set a specific traceId on a span. There is no method to set the traceId of a span, and no option to create a span with a specific traceId. You have to set it in the context somehow, but that also requires providing a spanId, and new spans created within that context inherit not only that traceId but also that spanId as their parent. That led me to create spans whose parent span does not exist (because, just like the traceId, I was generating the spanId myself).

Another annoying problem is that not every ID is a valid traceId or spanId. To generate the same traceId from a given correlation ID, I had to hash the string and trim the result to 32 characters for the traceId and 16 for the spanId. Nothing warns you about this: if you provide an invalid traceId or spanId, it is silently ignored and a new one is created.
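To make the ID constraints concrete, here is a minimal sketch of the format rules as the W3C Trace Context specification describes them: a trace ID is 32 lowercase hex characters, a span ID is 16, and neither may be all zeros. The names `looksLikeTraceId`, `looksLikeSpanId` and `deriveHexId` are hypothetical helpers written for illustration; they are not part of the OpenTelemetry API, and the SDK's own checks may differ in detail.

```typescript
import { createHash } from "node:crypto";

// Format rules taken from the W3C Trace Context spec:
// fixed-length lowercase hex strings that are not all zeros.
const TRACE_ID_PATTERN = /^[0-9a-f]{32}$/;
const SPAN_ID_PATTERN = /^[0-9a-f]{16}$/;
const ALL_ZEROS = /^0+$/;

// Hypothetical helper: true if `id` has the shape of a valid trace ID.
export function looksLikeTraceId(id: string): boolean {
  return TRACE_ID_PATTERN.test(id) && !ALL_ZEROS.test(id);
}

// Hypothetical helper: true if `id` has the shape of a valid span ID.
export function looksLikeSpanId(id: string): boolean {
  return SPAN_ID_PATTERN.test(id) && !ALL_ZEROS.test(id);
}

// Hypothetical helper: deterministically hash any string into
// `length` lowercase hex characters (SHA-256 yields 64 of them).
export function deriveHexId(input: string, length: 16 | 32): string {
  return createHash("sha256").update(input).digest("hex").slice(0, length);
}
```

Checking a derived ID with such a helper before handing it to the SDK would at least turn the silent fallback into an explicit failure.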

Below is the code that I came up with to try to group spans based on a specific correlation Id:

import crypto from "node:crypto";
import { api } from "@opentelemetry/sdk-node";

function deriveHexString(input: string, desiredLength: number) {
  const hash = crypto.createHash("sha256").update(input).digest("hex");
  return hash.slice(0, desiredLength);
}

/**
* Creates a new context where the trace ID is set based on the provided
* correlationId.
* If there is an active span, the details of it will be used
**/
function createContext(correlationId: string) {
  const activeSpan = api.trace.getActiveSpan();
  const traceId = api.isValidTraceId(correlationId)
    ? correlationId
    : deriveHexString(correlationId, 32);
  const newSpan = activeSpan || api.trace.getTracer("doh").startSpan("newSpan");
  const context = api.trace.setSpanContext(
    api.trace.setSpan(api.context.active(), newSpan),
    {
      ...newSpan.spanContext(),
      traceId: traceId,
    },
  );
  newSpan.end();
  return context;
}

export function withTraceContext<A, U extends unknown[]>(
  correlationId: string,
  fn: (...args: U) => A,
) {
  return (...args: U) => {
    const context = createContext(correlationId);
    return api.context.with(context, () => fn(...args));
  };
}

Please note that I already tried several permutations of the createContext function. For example, not creating any new span for the context and just using the traceId for that:

function createContext(correlationId: string) {
  const activeSpan = api.trace.getActiveSpan();
  const traceId = api.isValidTraceId(correlationId)
    ? correlationId
    : deriveHexString(correlationId, 32);
  const spanId = activeSpan?.spanContext().spanId || traceId.slice(0, 16);
  const context = api.trace.setSpanContext(api.context.active(), {
    spanId,
    traceId: traceId,
    traceFlags: 1,
  });
  return context;
}

Both of these lead to several spans not being properly nested (for reasons I cannot understand), but worse, within a trace many of the spans reference a parent span that does not exist, which is a serious problem. What is the correct way to achieve this? Maybe the problem is that the new span I'm generating is never sent?

Upvotes: 1

Views: 714

Answers (1)

Andy Caruso

Reputation: 46

Generally, you don't need to generate trace or span IDs manually in OpenTelemetry, and usually you don't need to handle the propagation of those IDs yourself.
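For context on what propagation carries between services: with the default W3C Trace Context propagator, the trace ID and parent span ID travel in a `traceparent` header of the form `version-traceId-spanId-flags`. The sketch below is a hand-rolled illustration of that header's shape only, assuming version `00`; `TraceParent`, `formatTraceParent` and `parseTraceParent` are hypothetical names, not the SDK's propagator, which you would normally let do this work.

```typescript
// Fields carried by a W3C `traceparent` header.
export interface TraceParent {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters (the parent span)
  sampled: boolean; // lowest bit of the trace flags
}

// Version "00" layout: 00-<traceId>-<spanId>-<flags>.
const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
const ALL_ZEROS = /^0+$/;

// Hypothetical helper: serialize the fields into a header value.
export function formatTraceParent(tp: TraceParent): string {
  return `00-${tp.traceId}-${tp.spanId}-${tp.sampled ? "01" : "00"}`;
}

// Hypothetical helper: parse a header value, or return undefined
// when it does not match the expected shape.
export function parseTraceParent(header: string): TraceParent | undefined {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are reserved as invalid by the spec.
  if (ALL_ZEROS.test(traceId) || ALL_ZEROS.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

In a message-based system, a value like this would ride along with each message next to the correlation ID, so the consumer can continue the producer's trace instead of inventing IDs.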

If you are using a framework alongside your code, you should review this list and see if one meets your needs.

For your correlationId, you could use the setValue function to store an entry on the context, which can then be read back later to associate spans with each other.

I would set up the function like this to make a new context for each call.


import { api } from "@opentelemetry/sdk-node";

const CORRELATION_ID = api.createContextKey('correlation-id');

/**
* Creates a new context that carries the provided correlationId
* as a context value.
* Trace and span IDs are left untouched.
**/
function createContext(correlationId: string) {
  return api.context.active().setValue(CORRELATION_ID, correlationId);
}

export function withTraceContext<A, U extends unknown[]>(
  correlationId: string,
  fn: (...args: U) => A,
) {
  return (...args: U) => {
    const context = createContext(correlationId);
    return api.context.with(context, () => fn(...args));
  };
}
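The code above only stores the correlation ID on the context; to group spans by it in a backend, the value still has to be read back and attached to each span, for example as an attribute. The following is a minimal sketch of that step. `ContextLike` and `SpanLike` are local stand-ins so the snippet compiles without the OpenTelemetry packages (the real `Context` and `Span` types have matching `getValue` and `setAttribute` methods), and both `tagSpanWithCorrelationId` and the attribute name `app.correlation_id` are assumptions made for illustration.

```typescript
// Structural stand-ins for OpenTelemetry's Context and Span types.
interface ContextLike {
  getValue(key: symbol): unknown;
}
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
}

// Hypothetical attribute name; pick whatever your backend can filter on.
const CORRELATION_ATTRIBUTE = "app.correlation_id";

// Hypothetical helper: copy the correlation ID stored under `key` in `ctx`
// onto `span` as an attribute. Returns false if no usable value was found.
export function tagSpanWithCorrelationId(
  ctx: ContextLike,
  span: SpanLike,
  key: symbol,
): boolean {
  const value = ctx.getValue(key);
  if (typeof value !== "string" || value.length === 0) return false;
  span.setAttribute(CORRELATION_ATTRIBUTE, value);
  return true;
}
```

With the real API this would presumably be called as `tagSpanWithCorrelationId(api.context.active(), span, CORRELATION_ID)` right after starting each span, so every span in the flow can be found by the same attribute regardless of which trace it belongs to.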

If you want to make a new span and context for each call you could do this:


import { api } from "@opentelemetry/sdk-node";

const CORRELATION_ID = api.createContextKey('correlation-id');

/**
* Creates a new context and span based on the provided
* correlationId.
**/
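function createContextAndSpan(correlationId: string) {
  const newContext = api.context.active().setValue(CORRELATION_ID, correlationId);
  const newSpan = api.trace.getTracer("doh").startSpan("newSpan", {}, newContext);
  // Make the new span the active span of the returned context so that spans
  // started inside it become its children. The span still has to be ended
  // by whoever owns it, e.g. api.trace.getSpan(context)?.end().
  return api.trace.setSpan(newContext, newSpan);
}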

export function withTraceContext<A, U extends unknown[]>(
  correlationId: string,
  fn: (...args: U) => A,
) {
  return (...args: U) => {
    const context = createContextAndSpan(correlationId);
    return api.context.with(context, () => fn(...args));
  };
}

Upvotes: 0
