Clarify values for code.function.name and code.namespace #1677

Open
6 tasks
SylvainJuge opened this issue Dec 12, 2024 · 12 comments

Comments

@SylvainJuge
Contributor

SylvainJuge commented Dec 12, 2024

This is part of #1599, where we aim to promote the code.* attributes to release candidate.

Some of the code.* attributes are being renamed with #1624, but this issue is about the values of those attributes and having a clear definition for them, in particular:

  • code.function.name (previously code.function).
  • code.namespace

In the discussion of #1624 we found that not having explicit per-language examples leaves interpretation open (here and here) and is likely to cause ambiguity and inconsistencies. There might also be per-language constraints or overhead in providing those values.

Those attributes are currently experimental, and we aim to promote them to release candidate with #1599. As per this comment, we consider that they are not used widely enough to justify a migration plan to prevent unexpected breaking changes (for example with the OTEL_SEMCONV_STABILITY_OPT_IN environment variable).

Here the goal would be to define for each language:

  • the value of code.function.name
  • the value of code.namespace
  • what value to use when only a single value is provided by the language/platform, in other words when to split and how
  • what value to use when only a partial value is available, for example anonymous lambdas/functions that might not have an explicit name in code (but likely have a "technical name").

For now, we aim to maximize consistency. However, if the overhead of providing those values is too high (for example due to extra allocation or string splitting), it might be acceptable to provide "slightly inconsistent values" to avoid it.

Checklist

@pellared
Member

pellared commented Dec 18, 2024

Wouldn't it be better/simpler to have just one attribute, code.function.name, whose value is a fully-qualified name (or "full name"), and get rid of code.namespace? It would solve both issues:

  • what value to use when only a single value is provided by the language/platform, in other words when to split and how
  • what value to use when only a partial value is available, for example anonymous lambdas/functions that might not have an explicit name in code (but likely have a "technical name").

Maybe we should name such an attribute code.function.full_name or code.function.fullname.

Related comment: #1624 (comment)
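
To make the trade-off concrete, here is a minimal Python sketch (not taken from any OTel SDK) of how a single fully-qualified value could be built from a function object, and how it could still be split on the last dot afterwards. The helper names full_name and split_full_name are hypothetical, and combining __module__ with __qualname__ is only one possible interpretation of "full name" for Python.

```python
def full_name(func):
    """Return a fully-qualified name for a function or method (assumed format)."""
    return f"{func.__module__}.{func.__qualname__}"


def split_full_name(name):
    """Split on the last dot into (namespace, function name)."""
    namespace, _, function = name.rpartition(".")
    return namespace, function


class MyClass:
    def my_method(self):
        pass


name = full_name(MyClass.my_method)
namespace, function = split_full_name(name)
print(name)       # e.g. "__main__.MyClass.my_method" when run as a script
print(namespace)  # e.g. "__main__.MyClass"
print(function)   # "my_method"
```

With this shape, a name without any dot ends up with an empty namespace, which is roughly the "only a partial value is available" case from the issue description.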

@SylvainJuge
Contributor Author

Having a "do it all" attribute might make sense, but I think that we need to first gather what would work best for each platform as separate fields, then we might merge them if that's a better approach. Here there might not be a single "best solution" as the values are platform/language dependent so choosing any is always a compromise.

@SylvainJuge
Contributor Author

For Java, I think the expected values for those attributes are the following, as captured with this gist using the reflection API.

-- regular class
code.namespace = com.mycompany.MyClass
code.function.name = myMethod

-- anonymous class
code.namespace = com.mycompany.Main$1
code.function.name = myMethod

-- primitive type
code.namespace = int
code.function.name = n/a, not available for primitive types

-- lambda
code.namespace = com.mycompany.Main$$Lambda/0x0000748ae4149c00
code.function.name = myMethod

When using the reflection API, the values for code.namespace and code.function.name are each provided through a single method call, so no extra overhead or processing is required.

Within Byte Buddy advice, we can also get those through the @Advice.Origin annotation, either as separate values or combined in a single one. Outside of instrumentation advice, for example with "inferred spans" which are generated by a sampling profiler in Java, an equivalent name has to be captured and thus might require some minimal string processing.

So in the case of Java, I think that having separate attributes is probably the best option.

@xrmx
Contributor

xrmx commented Jan 7, 2025

For Python we are sending the following attributes in logs:

        attributes[SpanAttributes.CODE_FILEPATH] = record.pathname
        attributes[SpanAttributes.CODE_FUNCTION] = record.funcName
        attributes[SpanAttributes.CODE_LINENO] = record.lineno

Values are taken from Python's logging.LogRecord and in practice they look like:

code.lineno: 42
code.function: test_log_record_user_attributes # this is the name of a method 
code.filepath: path/to/test_handler.py
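
For anyone who wants to see where these values come from, below is a minimal, self-contained sketch using only the standard logging module. The CodeAttributesHandler class and the logger name are hypothetical and are not the actual OTel Python handler; the point is only to show that LogRecord.funcName is the bare function or method name, so the record itself offers nothing that maps to code.namespace.

```python
import logging


class CodeAttributesHandler(logging.Handler):
    """Hypothetical handler that collects code.* values from each record."""

    def __init__(self):
        super().__init__()
        self.captured = []

    def emit(self, record):
        self.captured.append(
            {
                "code.filepath": record.pathname,
                "code.function": record.funcName,
                "code.lineno": record.lineno,
            }
        )


logger = logging.getLogger("code_attributes_demo")
logger.propagate = False  # keep the demo output out of the root logger
handler = CodeAttributesHandler()
logger.addHandler(handler)


def my_function():
    logger.warning("hello")


my_function()
print(handler.captured[0])
```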

@trask
Member

trask commented Jan 8, 2025

@open-telemetry/dotnet-approvers
@open-telemetry/cpp-approvers
@open-telemetry/erlang-approvers
@open-telemetry/go-approvers
@open-telemetry/javascript-approvers
@open-telemetry/php-approvers
@open-telemetry/rust-approvers
@open-telemetry/swift-approvers

could you help us out and post common example(s) of what would typically be captured in your language for code.namespace and code.function? thanks!

@brettmc

brettmc commented Jan 8, 2025

For PHP, we use these extensively for auto-instrumentation of a function or method call. For methods, code.namespace is the FQN of the class that the method belongs to. It may be blank for a global or built-in function.
code.function represents the function (or method) that was instrumented.

From one of our tests, GuzzleHttp\Client::transfer() was instrumented here:

["code.function"]=> string(8) "transfer"
["code.namespace"]=> string(17) "GuzzleHttp\Client"

I don't see any issue with us merging these into a single field.

@bryannaegele
Contributor

bryannaegele commented Jan 8, 2025

Erlang/Elixir would be the fully qualified module name for the namespace and function/arity for the function name.

Elixir example:

OpenTelemetry.Ctx.new()
Namespace: OpenTelemetry.Ctx
Name: new/0

Erlang

opentelemetry_ctx:new()
Namespace: opentelemetry_ctx
Name: new/0

@trentm
Contributor

trentm commented Jan 8, 2025

For Node.js/JavaScript:

tl;dr

There isn't any current normative usage in current OTel JS, so this is just my opinion. :)

Examples:

code.file.path: /Users/trentm/tmp/go-boom.js  (or file:///Users/trentm/tmp/go-boom.mjs)
code.function.name: foo  (or MyClass.mymethod)
code.line.number: 16
code.column.number: 9

or perhaps that MyClass.mymethod would be split into:

code.namespace: MyClass
code.function.name: mymethod

more details

(Sorry this got long.)

Current state: There is only one instrumentation in opentelemetry-js-contrib.git that is using code.* semconv values: instrumentation-cucumber here. However, I think this usage should be considered an outlier or at least not establish a norm.

If OTel instrumentation were to collect code location information, I think it would be from an Error stack. For example, the at foo (/Users/trentm/tmp/go-boom.js:2:9) line in the following short example:

$ cat go-boom.js
function foo() {
  throw new Error('boom');
}
foo();

$ node go-boom.js
...
Error: boom
    at foo (/Users/trentm/tmp/go-boom.js:2:9)
    at Object.<anonymous> (/Users/trentm/tmp/go-boom.js:4:1)
    at Module._compile (node:internal/modules/cjs/loader:1469:14)
...

In general in JavaScript, err.stack is not standardized but is basically always there as a string. Runtimes using v8 (e.g. Node.js) can set up the Error global object to collect a structured stack trace as described by: https://v8.dev/docs/stack-trace-api#customizing-stack-traces

So, theoretically the OTel JS SDK could install a custom Error.prepareStackTrace ...

... something like this modified "go-boom.js"
// CallSite API from v8: https://v8.dev/docs/stack-trace-api#customizing-stack-traces
const orig = Error.prepareStackTrace ?? (() => {});
Error.prepareStackTrace = function (err, stack) {
  const callsite = stack[0];
  console.log('--');
  console.log('code.file.path:', callsite.getFileName());
  console.log('code.function.name:', callsite.getFunctionName());
  console.log('code.line.number:', callsite.getLineNumber());
  console.log('code.column.number:', callsite.getColumnNumber());
  console.log('--');

  return orig(err, stack);
}

function foo() {
  throw new Error('boom');
}
foo();

the result of which would be:

% node go-boom.js
--
code.file.path: /Users/trentm/tmp/go-boom.js
code.function.name: foo
code.line.number: 16
code.column.number: 9
--
...
Error: boom
    at foo (/Users/trentm/tmp/go-boom.js:16:9)
    at Object.<anonymous> (/Users/trentm/tmp/go-boom.js:18:1)
...

This example uses the older CommonJS module system. With the newer ES Modules system, the value returned by callsite.getFileName() becomes a URL:

% node go-boom.mjs
--
code.file.path: file:///Users/trentm/tmp/go-boom.mjs
code.function.name: foo
code.line.number: 16
code.column.number: 9
--
...

So, in general, the typical code.file.path will be a local file path or a URL.

code.namespace

When using classes:

class Foo {
  bar() {
    throw new Error('here');
  }
}
const inst = new Foo();
inst.bar();

Perhaps we'd use code.namespace for the "Foo" class:

% node go-boom.js
--
code.file.path: /Users/trentm/tmp/go-boom.js
code.namespace?: Foo
code.function.name: bar
code.line.number: 18
code.column.number: 11
--
...
Error: here
    at Foo.bar (/Users/trentm/tmp/go-boom.js:18:11)
    at Object.<anonymous> (/Users/trentm/tmp/go-boom.js:22:6)
...

Another debate point would be when JavaScript bundlers are in play, where multiple files/modules are merged into one built file. However, I would expect sourcemaps would (sometimes) resolve back to the source filename.

@pellared
Member

pellared commented Jan 8, 2025

Go:

-- regular function
code.namespace = github.com/my/repo/pkg
code.function.name = foo

-- anonymous function (inside foo function)
code.namespace = github.com/my/repo/pkg.foo
code.function.name = func5 // or other funcN generated by the compiler where N is a positive integer

It is worth mentioning that Go operates on fully qualified names (e.g. github.com/my/repo/pkg.foo.func5) and we have to manually split a "fully qualified function name":

// splitFuncName splits package path-qualified function name into
// function name and package full name (namespace). E.g. it splits
// "github.com/my/repo/pkg.foo" into
// "foo" and "github.com/my/repo/pkg".
func splitFuncName(f string) (funcName, pkgName string) {
	i := strings.LastIndexByte(f, '.')
	if i < 0 {
		return "", ""
	}
	return f[i+1:], f[:i]
}

@intuibase

For PHP, we use these extensively for auto-instrumentation of a function or method call. For methods, code.namespace is the FQN of the class that the method belongs to. It may be blank for a global or built-in function. code.function represents the function (or method) that was instrumented.

From one of our tests, GuzzleHttp\Client::transfer() was instrumented here:

["code.function"]=> string(8) "transfer"
["code.namespace"]=> string(17) "GuzzleHttp\Client"

I would like to add that for PHP there are a few cases:

For a function defined in an anonymous namespace:
code.function: FunctionName

For a function defined in a named namespace:
code.function: Namespace\FunctionName

In this case, code.namespace will always be missing. In my opinion, this reveals an inconsistency in the interpretation of what constitutes a "namespace" versus a "function". This is due to the way PHP stores function names - introducing a split would create additional overhead.

For a class method defined in an anonymous namespace:

code.namespace: ClassName
code.function: MethodName

For a class method defined in a named namespace:

code.namespace: Namespace\ClassName
code.function: MethodName
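
For reference, the split mentioned above for namespaced functions would amount to something like the following. It is written in Python purely as an illustration, since the PHP instrumentation code is not shown in this thread, and the helper name split_php_name is hypothetical. It cuts the name at the last namespace separator, which is the extra string operation that adds the overhead.

```python
def split_php_name(name):
    """Split a PHP-style qualified name at its last backslash.

    Returns (namespace, function); namespace is empty when there is no separator.
    """
    namespace, _, function = name.rpartition("\\")
    return namespace, function


print(split_php_name("Namespace\\FunctionName"))  # ('Namespace', 'FunctionName')
print(split_php_name("FunctionName"))             # ('', 'FunctionName')
```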

@SylvainJuge
Contributor Author

@intuibase do we have an idea of the overhead that splitting would cause here? For Go the conclusion was that it is negligible, but not zero. The original intent here is to favor consistency, but if the overhead becomes too significant then we could keep some per-platform inconsistencies and document them.

@intuibase

@SylvainJuge It all depends on the application, but in the real world, the overhead should be imperceptible.
