feat: surface OpenTelemetry export errors and config #178
Draft
GregTheGreek wants to merge 1 commit into main
Conversation
Three small additions to the metric provider init in app.Run that turn
silent OTLP failures into observable startup signals:
1. Register a global error handler via otel.SetErrorHandler that routes SDK errors
(export failures, malformed URLs, TLS handshake errors, context
deadlines) to zerolog. Without this the SDK swallows export errors
and the relayer prints nothing.
2. Log the configured collector URL at INFO before initializing the
provider, so container logs show the target the relayer is
actually using.
3. Treat an empty OpenTelemetryCollectorURL as "telemetry disabled"
and substitute a noop.MeterProvider with a WARN log, instead of
the current behavior where url.Parse("") succeeds and the OTLP
HTTP exporter silently falls back to localhost:4318.
No behavior change when the collector URL is set correctly.
Co-Authored-By: Claude
Go Test coverage is 36.9% ✨ ✨ ✨
Summary
Three small additions to app.Run that turn silent OTLP failures into observable startup signals.

Why
Investigating "metrics not appearing in Grafana" today required guessing because the relayer prints nothing about telemetry at startup. Three sharp edges combined into one black box:
1. The OTel SDK silently swallows export errors
The metric SDK uses a global error handler. When none is registered, export failures (connection refused, TLS handshake, 404 on wrong path, context deadline exceeded) are dropped. Registering a handler that routes errors through zerolog makes container logs the source of truth.
2. The configured URL is invisible
Container logs say nothing about the OTel target. There is no way from logs to tell whether OpenTelemetryCollectorURL is empty, malformed, or pointing at an unreachable host. A single INFO log on startup fixes this.

3. An empty URL silently exports to localhost
InitMetricProvider calls url.Parse(agentURL), which returns no error for "". otlpmetrichttp.WithEndpoint("") then falls back to the OTLP HTTP default of localhost:4318. The relayer starts up clean and ships metrics into a void. Treating the empty case as "telemetry disabled" with a noop.MeterProvider and a WARN log makes misconfiguration loud while keeping local-dev simple (no collector required).

Behavior change

Only when OpenTelemetryCollectorURL is empty: previously the relayer silently exported to localhost:4318; now it uses noop.NewMeterProvider(), which has zero overhead and emits a single WARN at startup.
Test plan

- go build ./... clean
- go vet ./app/... clean
- Run with SYG_RELAYER_OPENTELEMETRYCOLLECTORURL empty: confirm the WARN log and that the relayer runs cleanly with no metric export attempts

Related
Pairs with #177 (declare seconds unit on duration histograms) which fixes the bucket-view mismatch for the same set of histograms. Together they make the staging metrics pipeline both observable from logs and meaningful in Grafana.