Conversation

@chatton chatton commented Jan 13, 2026

Overview

Add additional tracing within NewServiceHandler.

Add OpenTelemetry tracing for all RPC server methods:

- ExecutorService: InitChain, GetTxs, ExecuteTxs, SetFinal
- StoreService: GetBlock, GetState, GetMetadata, GetGenesisDaHeight, GetP2PStoreInfo
- P2PService: GetPeerInfo, GetNetInfo
- ConfigService: GetNamespace, GetSignerInfo

All implementations follow the decorator pattern, with comprehensive test coverage.
Tracer names: ev-node/{service}-service
@gemini-code-assist
Summary of Changes

Hello @chatton, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the observability of the system by integrating OpenTelemetry for distributed tracing. It introduces tracing capabilities for the Store, P2P, and Config RPC services, allowing for detailed insights into their execution flow and performance. The tracing can be toggled via configuration, providing flexibility for different environments.

Highlights

  • OpenTelemetry Integration: Introduced OpenTelemetry for distributed tracing across key RPC services.
  • Service Tracing: Added tracing to Store, P2P, and Config RPC service handlers.
  • Conditional Tracing: Tracing is conditionally enabled based on the config.Instrumentation.IsTracingEnabled() flag.
  • Detailed Span Attributes: Spans for RPC calls capture relevant request and response attributes (e.g., block height, hash, peer count, namespace) and record errors.
  • Dependency Updates: Updated go.mod and go.sum to include OpenTelemetry and testify dependencies.
  • Comprehensive Testing: New unit tests ensure the correct functionality and attribute capturing of the tracing wrappers.


claude bot commented Jan 13, 2026

Claude finished @chatton's task in 2m 27s


Code Review Complete

I've performed a comprehensive code review of this PR, which adds OpenTelemetry tracing to the Store, P2P, and Config RPC services. Overall, this is a high-quality implementation with good test coverage and a clean decorator pattern. However, there are several issues that need attention.


Critical Issues

1. Missing Per-Store Attributes in GetP2PStoreInfo 🔴

Location: pkg/rpc/server/tracing.go:140-158

The GetP2PStoreInfo method only records the count of stores but doesn't capture detailed attributes about each store (height, label, etc.). The test at line 315 of tracing_test.go also doesn't verify these attributes, which would have caught this gap.

Impact: Reduced observability for P2P store debugging and monitoring.

Recommendation: Add per-store attributes like:

span.SetAttributes(
    attribute.Int("store_count", len(res.Msg.Stores)),
)
for i, store := range res.Msg.Stores {
    span.SetAttributes(
        attribute.String(fmt.Sprintf("store_%d_label", i), store.Label),
        attribute.Int64(fmt.Sprintf("store_%d_height", i), int64(store.Height)),
    )
}

Related Test Gap: Update TestTracedStoreService_GetP2PStoreInfo_Success to verify these attributes.



Moderate Issues

2. Code Duplication in Error Handling 🟡

Location: Throughout pkg/rpc/server/tracing.go (lines 50-53, 80-83, 108-111, etc.)

Every traced method repeats this error handling pattern:

res, err := t.inner.Method(ctx, req)
if err != nil {
    span.RecordError(err)
    span.SetStatus(codes.Error, err.Error())
    return nil, err
}

Impact: Maintenance burden - any change to error handling requires updates in ~10 places.

Recommendation: Extract into a generic helper function. Example:

func traceUnary[Req, Res any](
    tracer trace.Tracer,
    ctx context.Context,
    spanName string,
    req *connect.Request[Req],
    call func(context.Context, *connect.Request[Req]) (*connect.Response[Res], error),
    attrs ...attribute.KeyValue,
) (*connect.Response[Res], error) {
    ctx, span := tracer.Start(ctx, spanName, trace.WithAttributes(attrs...))
    defer span.End()
    
    res, err := call(ctx, req)
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
        return nil, err
    }
    return res, nil
}

This would significantly reduce boilerplate across all traced methods.


3. Inconsistent Attribute Recording 🟡

Location: pkg/rpc/server/tracing.go:32-69 (GetBlock method)

The GetBlock method sets initial attributes differently than it sets response attributes. Initial attributes are set before span creation, while response attributes are set after the inner call. This inconsistency makes the code harder to follow.

Current approach:

  • Request attributes: Set before span creation
  • Response attributes: Set after inner call

Recommendation: Maintain consistency - either set all attributes after the call, or clearly separate request vs response attributes with prefixes like request.height and response.block_size_bytes.

4. Missing Nil Check Safety 🟡

Location: pkg/rpc/server/tracing.go:59-68

While there's a nil check for res.Msg.Block, there's no explicit check for res.Msg itself. Although this is likely guaranteed by the framework, defensive programming would add the check.

Recommendation:

if res.Msg != nil && res.Msg.Block != nil && res.Msg.Block.Data != nil {
    // ... existing logic
}

Minor Issues & Observations

5. Test Coverage Gaps ℹ️

Test Coverage: 82.24% patch coverage (per Codecov report)

Missing coverage includes:

  • Error scenarios for some methods (only GetBlock, GetPeerInfo, GetNamespace, and GetSignerInfo have error tests)
  • Edge cases like empty stores, nil responses, etc.

Recommendation: Add error test cases for:

  • GetState
  • GetMetadata
  • GetGenesisDaHeight
  • GetP2PStoreInfo
  • GetNetInfo

6. Conditional Tracing Pattern ℹ️

Location: pkg/rpc/server/server.go:385-389

The conditional wrapping approach is clean:

if config.Instrumentation.IsTracingEnabled() {
    storeServer = WithTracingStoreServer(storeServer)
    // ...
}

Positive: This is excellent! It allows zero-overhead when tracing is disabled and maintains clean separation of concerns using the decorator pattern.

7. Hex Encoding Consistency ✅

Location: Throughout tracing.go

Good consistent use of hex.EncodeToString() for binary data (hashes, addresses) in trace attributes. This makes the data readable in tracing UIs.

8. Import Organization ✅

Location: pkg/rpc/server/tracing.go:3-16

Imports are well-organized with standard library, third-party, and internal packages properly separated.


Security & Performance

Security ✅

  • No sensitive data exposure: Reviewed all attribute recordings - no private keys, credentials, or sensitive user data logged
  • Hash encoding: Binary hashes properly hex-encoded, preventing binary injection in trace backends
  • No injection risks: All string interpolation uses proper formatting

Performance ✅

  • Zero overhead when disabled: Conditional wrapping means no performance impact when tracing is off
  • Efficient attribute recording: Attributes are recorded inline without unnecessary allocations
  • Deferred span end: Proper use of defer span.End() ensures spans are closed even on errors

Minor consideration: The block size calculation (lines 60-67) iterates over all transactions. For very large blocks, this could add latency. Consider adding an attribute like calculate_block_size_ms to track this overhead, or make it conditional based on tx count.


Architecture & Design

Decorator Pattern ✅

The implementation correctly uses the decorator pattern:

  • Each traced server wraps the inner implementation
  • Clean separation between tracing logic and business logic
  • Easy to compose with other decorators if needed

Tracer Initialization ✅

Good use of otel.Tracer() with descriptive names:

  • "ev-node/store-service"
  • "ev-node/p2p-service"
  • "ev-node/config-service"

This allows filtering and grouping in trace backends.

Testing Strategy ✅

Excellent testing approach:

  • Mock implementations with injectable functions
  • SpanRecorder for verifying trace output
  • Helper functions (setupStoreTrace, requireAttribute) reduce test boilerplate
  • Tests verify both successful and error cases

Documentation

Missing Documentation 📝

Location: pkg/rpc/server/tracing.go

The file lacks:

  1. Package-level documentation explaining the tracing approach
  2. Comments on the traced server structs explaining their purpose
  3. Function comments on public constructor functions

Recommendation: Add:

// Package server provides RPC server implementations with optional OpenTelemetry tracing.
//
// Tracing is implemented using the decorator pattern. Each service (Store, P2P, Config)
// has a traced variant that wraps the original implementation and emits OpenTelemetry spans.

// tracedStoreServer decorates a StoreServiceHandler with OpenTelemetry spans.
// Each RPC method creates a span with relevant request/response attributes.
type tracedStoreServer struct { ... }

// WithTracingStoreServer decorates the provided store service handler with tracing spans.
// This should be applied when config.Instrumentation.IsTracingEnabled() returns true.
func WithTracingStoreServer(inner v1connect.StoreServiceHandler) v1connect.StoreServiceHandler { ... }

Summary & Recommendations

✅ Strengths

  1. Clean decorator pattern implementation
  2. Good test coverage with well-structured tests
  3. Proper error recording in traces
  4. Zero overhead when disabled
  5. No security issues identified
  6. Consistent attribute naming

🔧 Must Fix

  1. Add per-store attributes in GetP2PStoreInfo (Critical for observability)
  2. Add corresponding test assertions (would have caught critical issue 1 above)

🎯 Should Fix

  1. Extract error handling into generic helper (reduces duplication)
  2. Add error test cases for remaining methods
  3. Add package and function documentation

💡 Consider

  1. Make block size calculation conditional for large blocks
  2. Add attribute prefixes for request vs response data
  3. Add defensive nil checks

Test Execution Note

I don't have permissions to run the test suite directly. Please ensure:

make test           # Run unit tests
make lint           # Verify linting passes
make test-cover     # Check coverage remains above threshold

Branch: tracing-part-6

github-actions bot commented Jan 13, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed | Format: ⏩ skipped | Lint: ✅ passed | Breaking: ✅ passed | Updated (UTC): Jan 14, 2026, 2:21 PM

@gemini-code-assist bot left a comment
Code Review

This pull request introduces OpenTelemetry tracing for the Store, P2P, and Config services. The implementation uses a decorator pattern to wrap the existing service handlers, which is a good approach. The changes are well-tested, with a new test file covering the tracing logic. I've identified a few areas for improvement: a bug in attribute key generation, an opportunity to simplify the code in one of the traced methods, a suggestion to reduce code duplication across the tracing decorators, and a gap in the test coverage. Overall, this is a great addition for observability.

	)
	for i, store := range res.Msg.Stores {
		span.SetAttributes(
			attribute.Int64("store_"+string(rune(i))+"_height", int64(store.Height)),
Severity: high

There's a bug in how the attribute key is constructed. string(rune(i)) will convert the integer index i into its corresponding Unicode character, not its string representation. For example, for i=0, it will produce a null character (\x00), which is likely not what you intend and may cause issues with tracing backends.

You should use strconv.Itoa(i) to convert the integer to a string. This will require adding an import for the strconv package.

			attribute.Int64("store_"+strconv.Itoa(i)+"_height", int64(store.Height)),

Comment on lines 36 to 54
	var height uint64
	switch identifier := req.Msg.Identifier.(type) {
	case *pb.GetBlockRequest_Height:
		height = identifier.Height
	case *pb.GetBlockRequest_Hash:
		// for hash-based queries, we'll add the hash as an attribute
	}

	ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock",
		trace.WithAttributes(
			attribute.Int64("height", int64(height)),
		),
	)
	defer span.End()

	// add hash attribute if hash-based query
	if hashIdentifier, ok := req.Msg.Identifier.(*pb.GetBlockRequest_Hash); ok {
		span.SetAttributes(attribute.String("hash", hex.EncodeToString(hashIdentifier.Hash)))
	}
Severity: medium

The logic for setting span attributes based on the request identifier can be simplified. Currently, you're using a switch to get the height, and then a separate if to get the hash, which involves checking the type of req.Msg.Identifier twice. This can be done more cleanly in a single switch statement that builds the attributes before starting the span. This will make the code more readable and efficient.

	var attrs []attribute.KeyValue
	switch identifier := req.Msg.Identifier.(type) {
	case *pb.GetBlockRequest_Height:
		attrs = append(attrs, attribute.Int64("height", int64(identifier.Height)))
	case *pb.GetBlockRequest_Hash:
		if identifier.Hash != nil {
			attrs = append(attrs, attribute.String("hash", hex.EncodeToString(identifier.Hash)))
		}
	}

	ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock", trace.WithAttributes(attrs...))
	defer span.End()

Comment on lines +86 to +91
	res, err := t.inner.GetState(ctx, req)
	if err != nil {
		span.RecordError(err)
		span.SetStatus(codes.Error, err.Error())
		return nil, err
	}
Severity: medium

This block of code for handling errors from the inner service call and updating the span is repeated in almost every traced method in this file. To improve maintainability and reduce boilerplate, consider extracting this logic into a generic helper function.

For example, you could create a function that takes the context, request, the inner function call, and returns the response and error, handling the span creation and error recording internally. Here's a conceptual example:

func traceUnary[Req, Res any](
    tracer trace.Tracer,
    ctx context.Context,
    spanName string,
    req *connect.Request[Req],
    call func(context.Context, *connect.Request[Req]) (*connect.Response[Res], error),
    // ... other params for attributes
) (*connect.Response[Res], error) {
    ctx, span := tracer.Start(ctx, spanName)
    defer span.End()

    res, err := call(ctx, req)
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
        return nil, err
    }

    // ... set success attributes

    return res, nil
}

Applying this pattern would make the tracing decorators much more concise.

Comment on lines +314 to +315
	attrs := span.Attributes()
	requireAttribute(t, attrs, "store_count", 2)
Severity: medium

This test is missing assertions for the per-store attributes (store_0_height, store_1_height). The current test only checks for store_count. Adding assertions for individual store heights would have caught the attribute key generation bug in GetP2PStoreInfo and would make the test more robust.

	attrs := span.Attributes()
	requireAttribute(t, attrs, "store_count", 2)
	requireAttribute(t, attrs, "store_0_height", int64(100))
	requireAttribute(t, attrs, "store_1_height", int64(99))

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 82.23684% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.19%. Comparing base (8cd52bd) to head (d63f852).

Files with missing lines    Patch %   Lines
pkg/rpc/server/tracing.go   84.13%    18 Missing and 5 partials ⚠️
pkg/rpc/server/server.go    42.85%    3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2972      +/-   ##
==========================================
+ Coverage   57.82%   58.19%   +0.37%     
==========================================
  Files          98       99       +1     
  Lines        9394     9543     +149     
==========================================
+ Hits         5432     5554     +122     
- Misses       3358     3379      +21     
- Partials      604      610       +6     
Flag       Coverage Δ
combined   58.19% <82.23%> (+0.37%) ⬆️

Base automatically changed from cian/add-tracing-part-5 to main January 13, 2026 16:14
@chatton chatton marked this pull request as ready for review January 14, 2026 15:54