Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OpenTelemetryService initialization+observables code and fix the span closure of Chat Spans #6807

Merged
merged 3 commits into from
Jan 28, 2025

Conversation

arafatkatze
Copy link
Contributor

@arafatkatze arafatkatze commented Jan 27, 2025

Changes

  1. Fixing problem with externalAuthRefreshChanges not having initial value which was blocking the configuration
  2. Making caching checks more correct
  3. Makes sure the spans for chat are properly closed so that we can actually see them in HoneyComb
  4. Fixing different edge cases in the re-registration of OtelCollectors with observables

Span Closure Issues

image This PR ensures that the spans for handling chat messages are properly ended, so that they are properly finished and exported. This way, they can be seen in the Trace View(I have verified this behavior by testing locally too) so now honeycomb should get proper chat traces.

NOTE: This behavior was working perfectly in the original PR but in this massive refactor the span closure got accidentally removed so now I am adding those span closures again and adding a comment for future safety.

OpenTelemetry Configuration Issues

This PR resolves configuration update issues in the OpenTelemetry integration by implementing a provider recycling pattern, ensuring stable tracing across dynamic configuration changes while preventing API registration conflicts.
In the past we would re-register on this.tracerProvider on opentelemetry changes but that lead to a lot of oTel errors like

"OTLPExporterError: Unprocessable Entity
\tat IncomingMessage.<anonymous> (/Users/pkukielka/Work/sourcegraph/cody2/vscode/dist/extension.node.js:183479:31)
\tat IncomingMessage.emit (node:events:530:35)
\tat IncomingMessage.emit (node:domain:489:12)
\tat endReadableNT (node:internal/streams/readable:1698:12)
\tat process.processTicksAndRejections (node:internal/process/task_queues:82:21)"


Or alternatively some form of re-registration errors so this PR registers TracerProvider only ones but it changes the processors so that they can account for different conditions like change in telemetry sending URL and enable/disable etc.

Test Plan

NOTE: Every operation in the test plan like turning false to true or changing the account requires a wait time of 5 seconds atleast coz observables take time to "propagate".

Test 1: Tracing Flag Toggle Without Restart

Steps:

  • Set "cody.experimental.tracing": true in settings.json of the vscode config (no restart).

  • Ask a chat question.

  • Verify: CodyTraceExporter::export is executed (check debugger).

  • Verify: No otlp_export_error in debug console.

  • Set cody.experimental.tracing to false (no restart).

  • Ask a chat question.

  • Verify: CodyTraceExporter::export is not executed and blocked by trace enablement check.

  • Verify: No errors in console.

  • Set cody.experimental.tracing to true again (no restart).

  • Ask a chat question.

  • Verify: CodyTraceExporter::export executes and emits data (passes trace enablement check).

  • Verify: No otlp_export_error.

Test 2: Account Switching with Tracing

Steps:

Account A Setup (sourcegraph.com):

  • Set "cody.experimental.tracing": true in config.

  • Ask a chat question.

  • Verify: CodyTraceExporter::export executes.

  • Verify: No errors.

Switch to Account B (sourcegraph.sourcegraph.com):(No need to restart cody)

  • Ask a chat question.

  • Verify: Tracing is enabled by default in the new account; CodyTraceExporter::export is called.

  • Verify: No errors.

  • Ask a chat question.

  • Verify: CodyTraceExporter::export executes for Account B.

  • See the export function for spans being hit

Test 3: Error Handling Validation

Steps:

  • Across all test cases above, explicitly check the debug console for:

  • Absence of otlp_export_error or other tracing-related errors.

  • No unexpected authentication or network errors during account switches.

@arafatkatze arafatkatze marked this pull request as ready for review January 27, 2025 04:56
@arafatkatze arafatkatze changed the title Ara/fix otel observables Fix OpenTelemetryService initialization and observables code Jan 27, 2025
@dominiccooney
Copy link
Contributor

Let's wait until @umpox has stabilized main before we look at this.

@arafatkatze arafatkatze changed the title Fix OpenTelemetryService initialization and observables code Fix OpenTelemetryService initialization+observables code And also fix the span closure of Chat Spans Jan 27, 2025
@arafatkatze arafatkatze changed the title Fix OpenTelemetryService initialization+observables code And also fix the span closure of Chat Spans Fix OpenTelemetryService initialization+observables code and fix the span closure of Chat Spans Jan 27, 2025
@umpox
Copy link
Contributor

umpox commented Jan 27, 2025

@dominiccooney @arafatkatze Main should be stable now, please rebase this onto the latest main commit

@arafatkatze arafatkatze force-pushed the ara/fix-otel-observables branch 2 times, most recently from c77e29e to a97768f Compare January 28, 2025 05:29
@arafatkatze arafatkatze force-pushed the ara/fix-otel-observables branch from a97768f to d1f7a69 Compare January 28, 2025 05:30
Copy link
Contributor

@pkukielka pkukielka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arafatkatze arafatkatze merged commit 8a23bad into main Jan 28, 2025
21 checks passed
@arafatkatze arafatkatze deleted the ara/fix-otel-observables branch January 28, 2025 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants